"Original address: http://blog.csdn.net/yangzhongXuan/Article/details/8003494
Glossary
Field and frame: a field or a frame of video can be used to produce one coded picture. In television, a frame is split into two interlaced fields to reduce large-area flicker.
Slice: within each picture, macroblocks are grouped into slices. Slices are classified as I, B, P, and other types.
An I slice contains only I macroblocks, a P slice can contain P and I macroblocks, and a B slice can contain B and I macroblocks.
An I macroblock is intra predicted from pixels that have already been decoded in the current slice.
A P macroblock is inter predicted using a previously coded picture as its reference.
A B macroblock is inter predicted using bidirectional reference pictures (a previous frame and a following frame).
The purpose of slices is to limit the spread and propagation of errors by keeping coded slices independent of one another.
Prediction inside a slice cannot refer to macroblocks in other slices, so a prediction error in one slice does not propagate to other slices.
Macroblock: a coded picture is usually divided into macroblocks. A macroblock consists of one 16 × 16 block of luma pixels plus one 8 × 8 Cb and one 8 × 8 Cr block of chroma pixels.
Relationships between the data:
In the H.264 structure, the coded data of one video picture is called a frame. A frame consists of one or more slices, a slice consists of one or more macroblocks (MB), and a macroblock consists of 16x16 YUV data. The macroblock is the basic unit of H.264 coding.
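The frame, slice, and macroblock arithmetic can be illustrated with a small sketch. It assumes a progressive frame and the macroblock layout described above (one 16 × 16 luma block plus two 8 × 8 chroma blocks); the function names are hypothetical.

```python
import math

MB_SIZE = 16  # a macroblock covers 16x16 luma pixels


def macroblock_grid(width, height):
    # Pictures are coded in whole macroblocks, so each dimension is rounded up.
    mbs_wide = math.ceil(width / MB_SIZE)
    mbs_high = math.ceil(height / MB_SIZE)
    return mbs_wide, mbs_high, mbs_wide * mbs_high


def samples_per_macroblock():
    # One 16x16 luma block plus one 8x8 Cb block and one 8x8 Cr block.
    return 16 * 16 + 8 * 8 + 8 * 8


print(macroblock_grid(1920, 1080))  # (120, 68, 8160)
print(samples_per_macroblock())     # 384
```

A 1080-line picture is not a multiple of 16, so the last row of macroblocks extends past the visible picture.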
Three forms of data appear in the H.264 encoding process:
SODB (String Of Data Bits) ----> the rawest coded data, i.e. the VCL data;
RBSP (Raw Byte Sequence Payload) ----> the SODB followed by trailing bits (RBSP trailing bits: one "1" bit and then as many "0" bits as needed) so that the data is byte aligned;
EBSP (Encapsulated Byte Sequence Payload) ----> the RBSP with the emulation prevention byte (0x03) added. The reason is as follows: when NALUs are written in Annex B format, a start code (start code prefix) must be added before each NALU. If the slice carried by the NALU is the start of a frame, the 4-byte start code 0x00000001 is used; otherwise (the slice is a later part of a frame) the 3-byte start code 0x000001 is used. To keep the NALU body from colliding with the start code, the encoder inserts a 0x03 byte whenever two consecutive 0x00 bytes are followed by a byte in the range 0x00 to 0x03. The decoder removes the 0x03 again. This removal is also known as the unshelling operation.
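The path from SODB to RBSP to EBSP can be sketched on the encoder side as follows. This is a minimal illustration, not an encoder: the SODB is represented as a string of '0' and '1' characters purely for readability, and the function names are hypothetical.

```python
def sodb_to_rbsp(bits):
    # bits: the SODB as a string of '0'/'1' characters (illustrative representation).
    padded = bits + '1'                  # the single stop bit
    padded += '0' * (-len(padded) % 8)   # zero bits up to the next byte boundary
    return bytes(int(padded[i:i + 8], 2) for i in range(0, len(padded), 8))


def rbsp_to_ebsp(rbsp):
    # Insert 0x03 whenever two zero bytes would be followed by 0x00..0x03.
    out = bytearray()
    zeros = 0  # consecutive 0x00 bytes at the end of the output so far
    for b in rbsp:
        if zeros >= 2 and b <= 0x03:
            out.append(0x03)
            zeros = 0
        out.append(b)
        zeros = zeros + 1 if b == 0 else 0
    return bytes(out)


print(sodb_to_rbsp('1011').hex())               # b8
print(rbsp_to_ebsp(bytes([0, 0, 1])).hex())     # 00000301
```

Counting zeros in the output rather than the input means an inserted 0x03 resets the run, so a long run of zero bytes receives an emulation prevention byte after every second zero.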
H.264/AVC layered structure
The main goals of H.264 are:
1. High video compression ratio;
2. Good network friendliness;
To achieve these goals, H.264 is divided into two layers:
1. VCL (Video Coding Layer);
2. NAL (Network Abstraction Layer);
The VCL is the core algorithm engine. It defines the syntax at the block, macroblock, and slice levels, and its final output is the coded data, the SODB.
The NAL defines the syntax above the slice level (such as the sequence parameter set and the picture parameter set, which are aimed at network transmission).
It also supports the following: independent slice decoding, uniqueness of the start code, SEI, and transport of coded data in stream format. The NAL packs the SODB into an RBSP and then adds a NAL header to form a NALU (NAL unit).
H.264 network transmission structure
H.264 is transmitted over the network as NALUs. The structure of a NALU is: NAL header + RBSP. The data stream in actual transmission is shown in the figure:
The NALU header identifies what type of data the following RBSP carries, whether it is referenced by other frames, and whether an error occurred during network transmission.
NALU header structure
Length: 1 byte, laid out as forbidden_bit (1 bit) + nal_reference_bit (2 bits) + nal_unit_type (5 bits). The standard names the first two fields forbidden_zero_bit and nal_ref_idc.
1. forbidden_bit: the forbidden bit, initially 0. When the network finds a bit error in the NAL unit, it can set this bit to 1 so that the receiver can correct or drop the unit.
2. nal_reference_bit: the importance indicator of the NAL unit. The larger the value, the more important the unit. When the decoder cannot keep up with decoding, it may discard NALUs whose importance is 0.
The importance of the different NALU types is shown in the table below.
nal_unit_type | NAL type | nal_reference_bit
0 | Unused | 0
1 | Slice of a non-IDR picture | nonzero if the slice belongs to a reference frame, 0 if it does not
2 | Slice data partition A | same as above
3 | Slice data partition B | same as above
4 | Slice data partition C | same as above
5 | Slice of an IDR picture | nonzero
6 | Supplemental enhancement information (SEI) | 0
7 | Sequence parameter set | nonzero
8 | Picture parameter set | nonzero
9 | Access unit delimiter | 0
10 | End of sequence | 0
11 | End of stream | 0
12 | Filler data | 0
13..23 | Reserved | 0
24..31 | Unspecified | 0
A reference frame is a frame that other frames need as a reference when they are decoded. For example, an I frame may be referenced by one or more B frames, and a B frame may be referenced by a P frame.
From this table we can also see that the I frame of an IDR picture is very important: if it is lost, none of the frames in its sequence can be decoded;
the sequence parameter set and the picture parameter set are also very important. Without the sequence parameter set, the sequence cannot be decoded;
without the picture parameter set, the frames that use that picture parameter set cannot be decoded.
3. nal_unit_type: the NALU type values are shown in the table below.
nal_unit_type | NAL type | C
0 | Unused |
1 | Slice of a non-IDR picture without data partitioning | 2, 3, 4
2 | Class A data partition of a non-IDR picture | 2
3 | Class B data partition of a non-IDR picture | 3
4 | Class C data partition of a non-IDR picture | 4
5 | Slice of an IDR picture | 2, 3
6 | Supplemental enhancement information (SEI) | 5
7 | Sequence parameter set | 0
8 | Picture parameter set | 1
9 | Access unit delimiter | 6
10 | End of sequence | 7
11 | End of stream | 8
12 | Filler data | 9
13..23 | Reserved |
24..31 | Unspecified (used for RTP packetization) |
Extension types used for RTP packetization
24 | STAP-A | Single-time aggregation packet
25 | STAP-B | Single-time aggregation packet
26 | MTAP16 | Multi-time aggregation packet
27 | MTAP24 | Multi-time aggregation packet
28 | FU-A | Fragmentation unit
29 | FU-B | Fragmentation unit
30-31 | undefined |
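The one-byte NALU header described in this section can be unpacked with a few shifts and masks. The sketch below uses the field names from this article; the lookup table and function names are hypothetical and cover only the types listed above.

```python
NAL_TYPE_NAMES = {
    1: 'non-IDR slice', 2: 'data partition A', 3: 'data partition B',
    4: 'data partition C', 5: 'IDR slice', 6: 'SEI',
    7: 'sequence parameter set', 8: 'picture parameter set',
    9: 'access unit delimiter', 10: 'end of sequence',
    11: 'end of stream', 12: 'filler data',
    24: 'STAP-A', 25: 'STAP-B', 26: 'MTAP16', 27: 'MTAP24',
    28: 'FU-A', 29: 'FU-B',
}


def parse_nal_header(first_byte):
    forbidden_bit = (first_byte >> 7) & 0x01       # top bit
    nal_reference_bit = (first_byte >> 5) & 0x03   # next 2 bits
    nal_unit_type = first_byte & 0x1F              # low 5 bits
    return forbidden_bit, nal_reference_bit, nal_unit_type


def describe_nal_header(first_byte):
    forbidden, ref, nal_type = parse_nal_header(first_byte)
    name = NAL_TYPE_NAMES.get(nal_type, 'unused/reserved/undefined')
    return 'forbidden=%d ref=%d type=%d (%s)' % (forbidden, ref, nal_type, name)


for byte in (0x67, 0x68, 0x65, 0x41, 0x06):
    print(hex(byte), describe_nal_header(byte))
```

The header bytes 0x67, 0x68, and 0x65 are the values commonly seen at the start of an SPS, a PPS, and an IDR slice, which makes them handy when inspecting a hex dump.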
RBSP
An RBSP carries one of the types in the following table.
RBSP type | Abbreviation | Description
Parameter set | PS | Global information about the sequence, such as picture size, video format, etc.
Supplemental enhancement information | SEI | Enhancement information for decoding the video sequence
Picture delimiter | PD | Boundary between video pictures
Coded slice | SLICE | Header information and data of a coded slice
Data partition | DP | Data of a data-partitioned slice, used for error-resilient decoding
End of sequence | | Indicates that the next picture is an IDR picture
End of stream | | Indicates that no more pictures follow in the stream
Filler data | | Dummy data used as padding bytes
From the analysis above, we know that the VCL outputs coded video frame data.
These frames may be I, B, or P frames, and they may belong to different sequences. Each sequence in turn has its own sequence parameter set, picture parameter set, and so on.
Therefore, decoding the video requires not only the frame data coded by the VCL but also the sequence parameter set, the picture parameter set, and other data.
Parameter sets: the sequence parameter set (SPS) and the picture parameter set (PPS)
The SPS contains parameters that apply to a continuous coded video sequence, such as the identifier seq_parameter_set_id, constraints on the frame number and POC, the number of reference frames, the decoded picture size, the frame/field coding mode selection flag, and so on.
The PPS applies to a single picture or to a few pictures within a sequence.
Its parameters include the identifier pic_parameter_set_id, the optional seq_parameter_set_id, the entropy coding mode selection flag, the number of slice groups, the initial quantization parameters, the deblocking filter coefficient adjustment flag, and so on.
Data partitioning: the coded data of a slice is stored in 3 independent data partitions (DP: A, B, and C), each containing a subset of the coded slice.
Partition A contains the slice header and the header data of each macroblock in the slice.
Partition B contains the coded residual data of intra and SI slice macroblocks.
Partition C contains the coded residual data of inter macroblocks.
Each partition can be placed in a separate NAL unit and transmitted independently.
The beginning and end of a NAL unit
The encoder places each NAL unit, independently and in full, into a packet. Because packets have headers, the decoder can easily detect the NAL boundaries and take out the NAL units one by one for decoding.
Each NAL unit is preceded by a start code 0x00 00 01 (or 0x00 00 00 01). The decoder looks for each start code as the start marker of a NAL unit; when the next start code is detected, the current NAL unit ends.
H.264 also specifies that detecting 0x000000 can mark the end of the current NAL unit. So what happens when the data inside a NAL unit contains 0x000001 or 0x000000? H.264 introduces an emulation prevention mechanism: if the encoder detects 0x000001 or 0x000000 in the NAL data, it inserts a new byte 0x03 before the last byte, so that:
0x000000-> 0x00000300
0x000001-> 0x00000301
0x000002-> 0x00000302
0x000003-> 0x00000303
When the decoder detects 0x000003, it discards the 03 and restores the original data (the unshelling operation). When decoding, the decoder first reads the NAL data byte by byte, counts the length of the NAL unit, and then starts decoding.
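The decoder-side steps described here, finding NAL units between start codes and then dropping the emulation prevention bytes, can be sketched as follows. The function names are hypothetical, and the sketch assumes the whole byte stream is already in memory.

```python
START = b'\x00\x00\x01'  # the 4-byte start code ends with these same 3 bytes


def split_annexb(stream):
    # Return the NAL units found between start codes.
    nalus = []
    pos = stream.find(START)
    while pos != -1:
        begin = pos + len(START)
        pos = stream.find(START, begin)
        end = len(stream) if pos == -1 else pos
        # Zero bytes at the end belong to the next 4-byte start code
        # (or to trailing padding), not to this NAL unit.
        nalu = stream[begin:end].rstrip(b'\x00')
        if nalu:
            nalus.append(nalu)
    return nalus


def remove_emulation_prevention(ebsp):
    # Drop every 0x03 that directly follows two 0x00 bytes.
    out = bytearray()
    zeros = 0
    for b in ebsp:
        if zeros >= 2 and b == 0x03:
            zeros = 0
            continue
        out.append(b)
        zeros = zeros + 1 if b == 0 else 0
    return bytes(out)


stream = b'\x00\x00\x00\x01\x67\x42\x00\x00\x03\x01' + b'\x00\x00\x01\x68\xce'
for nalu in split_annexb(stream):
    header, payload = nalu[0], remove_emulation_prevention(nalu[1:])
    print(hex(header), payload.hex())
```

Searching only for the 3-byte pattern handles both start code lengths, because the extra leading zero of a 4-byte start code is stripped from the end of the preceding NAL unit.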
NALU ordering requirements
The H.264/AVC standard places strict requirements on the order of the NAL units sent to the decoder. If the NAL units are out of order, they must be reorganized according to the specification before being sent to the decoder; otherwise the decoder cannot decode them correctly.
1. Sequence parameter set NAL unit
It must be transmitted before all other NAL units that refer to this parameter set, but repeated sequence parameter set NAL units are allowed in between those NAL units.
Repeated here means the following: a sequence parameter set NAL unit has its own dedicated identifier, and if two sequence parameter set NAL units have the same identifier, the later one can be regarded as a copy of the earlier one rather than a new sequence parameter set.
2. Picture parameter set NAL unit
It must be transmitted before all other NAL units that refer to this parameter set, but repeated picture parameter set NAL units are allowed in between those NAL units, in the same way as described above for sequence parameter set NAL units.
3. Slice units and data partition units of different primary coded pictures must not be interleaved. That is, within the series of slice units and data partition units that belong to one primary coded picture, no slice unit or data partition unit of another primary coded picture may appear.
4. Effect of reference pictures: if one picture uses another picture as a reference, all slice units and data partition units belonging to the former must come after those belonging to the latter. This rule must be observed for primary coded pictures and redundant coded pictures alike.
5. All slice units and data partition units of a primary coded picture must come before the slice units and data partition units of the corresponding redundant coded picture.
6. If consecutive reference primary coded pictures appear in the data stream, the one with the smaller picture number comes first.
7. If arbitrary_slice_order_allowed_flag is set to 1, the slices within a primary coded picture may appear in any order. If arbitrary_slice_order_allowed_flag is set to 0, the order of the slices is determined by the position of the first macroblock in each slice. If data partitioning is used, class A partitions come before class B partitions, and class B partitions come before class C partitions; the data partitions of different slices must not be interleaved with each other, nor with slices that do not use data partitioning.
8. If an SEI (supplemental enhancement information) unit is present, it must come before the slice units and data partition units of the primary coded picture it belongs to, and after all slice units and data partition units of the previous primary coded picture. If the SEI belongs to several primary coded pictures, its position is determined with respect to the first of them only.
9. If a picture delimiter is present, it must come before all SEI units, slice units, and data partition units of the primary coded picture, and immediately after the NAL units of the previous primary coded picture.
10. If an end-of-sequence unit is present and a picture follows it, that picture must be an IDR (instantaneous decoding refresh) picture. The end-of-sequence unit must come before the picture delimiter, SEI units, and other data of that IDR picture, and immediately after the NAL units of the preceding picture. If no picture follows the end-of-sequence unit, it comes after all picture data in the bitstream.
11. The end-of-stream unit comes at the very end of the bitstream.
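A deliberately simplified check of rules 1 and 2 can be sketched as follows. It only looks at nal_unit_type values and assumes a stream with a single SPS and a single PPS, so it does not compare parameter set identifiers as a real checker would have to; the function name is hypothetical.

```python
SPS, PPS = 7, 8
SLICE_TYPES = {1, 2, 3, 4, 5}  # slices and data partitions


def parameter_sets_precede_slices(nal_types):
    # nal_types: nal_unit_type values in transmission order.
    seen_sps = False
    seen_pps = False
    for nal_type in nal_types:
        if nal_type == SPS:
            seen_sps = True   # repeated copies are allowed
        elif nal_type == PPS:
            seen_pps = True
        elif nal_type in SLICE_TYPES and not (seen_sps and seen_pps):
            return False      # a slice arrived before its parameter sets
    return True


print(parameter_sets_precede_slices([7, 8, 6, 5, 1, 1]))  # True
print(parameter_sets_precede_slices([5, 7, 8, 1]))        # False
```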
H.264 has two packaging formats. One is the Annex B format, the traditional format, which has start codes and carries the SPS and PPS in the elementary stream (ES). The other is the MP4 format, generally found in MP4 and MKV files, which has no start codes: the SPS, PPS, and other information are stored in the container, and each frame is preceded by the length of that frame. Many decoders support only the Annex B format, so MP4 data has to be converted. In FFmpeg, the h264_mp4toannexb bitstream filter performs the conversion. The calls below belong to the older av_bitstream_filter API; newer FFmpeg versions provide the av_bsf_* API instead.
Register the filter: avcbsfc = av_bitstream_filter_init(""h264_mp4toannexb"");
Convert the bitstream: av_bitstream_filter_filter(AVBitStreamFilterContext *bsfc, AVCodecContext *avctx, const char *args, uint8_t **poutbuf, int *poutbuf_size, const uint8_t *buf, int buf_size, int keyframe) "
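The core of the MP4 to Annex B conversion, replacing each length prefix with a start code, can be sketched as follows. This is not FFmpeg's implementation: it assumes a 4-byte big-endian length prefix (the actual size is declared by the container), it does not prepend the SPS and PPS stored in the container, and the function name is hypothetical.

```python
START_CODE = b'\x00\x00\x00\x01'


def length_prefixed_to_annexb(sample, length_size=4):
    # sample: one MP4 sample, i.e. a run of (length, NAL unit) pairs.
    out = bytearray()
    pos = 0
    while pos < len(sample):
        if pos + length_size > len(sample):
            raise ValueError('truncated length field')
        nalu_len = int.from_bytes(sample[pos:pos + length_size], 'big')
        pos += length_size
        nalu = sample[pos:pos + nalu_len]
        if len(nalu) != nalu_len:
            raise ValueError('truncated NAL unit')
        out += START_CODE + nalu  # the NAL unit bytes themselves are unchanged
        pos += nalu_len
    return bytes(out)


sample = b'\x00\x00\x00\x02\x65\x88' + b'\x00\x00\x00\x01\x41'
print(length_prefixed_to_annexb(sample).hex())
```

The NAL unit payload needs no re-escaping, because the emulation prevention bytes are part of the NAL unit in both packaging formats.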