FMUSER Wirless Transmit Video And Audio More Easier !
Preface
H264 is now arguably the most widely used and most popular of all video compression techniques. With open-source libraries such as x264, openh264 and ffmpeg available, most users no longer need to study the details of H264, which greatly lowers the cost of using it.
But to make good use of H264, we still need to understand its basic principles. That is what this article covers.
H264 overview
H264 compression technology mainly uses the following methods to compress video data:
Intra-frame prediction compression solves the problem of spatial data redundancy.
Inter-frame prediction compression (motion estimation and compensation) solves the problem of time-domain data redundancy.
Integer discrete cosine transform (DCT), which converts spatially correlated data into largely uncorrelated data in the frequency domain, which is then quantized.
CABAC compression.
Compressed frames come in three types: I frames, P frames and B frames:
I frame: key frame, using intra-frame compression technology.
P frame: forward reference frame. When compressed, it refers only to previously processed frames. It uses inter-frame compression technology.
B frame: bidirectional reference frame. When compressed, it refers to both the previous frame and the following frame. It uses inter-frame compression technology.
In addition to I/P/B frames, there is also the group of pictures (GOP).
GOP: the image sequence between two I frames. Each image sequence contains only one I frame. As shown below:
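The GOP definition can be illustrated with a minimal sketch that splits a string of frame-type letters into groups, each starting at an I frame. The function name `split_into_gops` and the sample sequence are assumptions made for illustration; a real encoder decides frame types itself and stores frames in coding order rather than display order.

```python
def split_into_gops(frame_types):
    """Split a sequence of frame-type letters into GOPs.

    A new GOP starts at every I frame, so each group holds exactly one
    I frame followed by the P/B frames that depend on it.
    """
    gops = []
    for frame_type in frame_types:
        # Start a new group at an I frame (or at the very first frame).
        if frame_type == "I" or not gops:
            gops.append([])
        gops[-1].append(frame_type)
    return gops


# Hypothetical display-order sequence of frame types.
sequence = "IBBPBBPIBBPBBP"
gops = split_into_gops(sequence)
print(["".join(gop) for gop in gops])  # ['IBBPBBP', 'IBBPBBP']
```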
Now we will describe the H264 compression technology in detail.
H264 compression technology
The basic principle of H264 is actually quite simple. Let's briefly walk through how H264 compresses data. The video frames captured by the camera (assume 30 frames per second) are sent to the buffer of the H264 encoder. The encoder first divides each picture into macroblocks.
Take the following picture as an example:
Partition macroblock
H264 uses a 16X16 area as a macroblock by default; a macroblock can also be divided into 8X8 blocks.
After the macroblocks are divided, the pixel values of each macroblock are calculated.
In the same way, the pixel values of every macroblock in the image are calculated, and all the macroblocks are then processed as follows.
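As a rough sketch of the partitioning step, the following code cuts a frame of luma samples into 16X16 macroblocks. The frame is modelled as a plain list of rows, and `split_into_macroblocks` is a hypothetical helper; it assumes the frame dimensions are multiples of 16, whereas a real encoder pads frames that are not.

```python
MB_SIZE = 16  # default H264 macroblock size


def split_into_macroblocks(frame, mb_size=MB_SIZE):
    """Return {(mb_row, mb_col): block} for a frame given as a list of rows.

    Assumes height and width are multiples of mb_size.
    """
    height, width = len(frame), len(frame[0])
    if height % mb_size or width % mb_size:
        raise ValueError("frame dimensions must be multiples of mb_size")
    blocks = {}
    for top in range(0, height, mb_size):
        for left in range(0, width, mb_size):
            block = [row[left:left + mb_size] for row in frame[top:top + mb_size]]
            blocks[(top // mb_size, left // mb_size)] = block
    return blocks


# Synthetic 32x48 luma frame (2 macroblock rows, 3 macroblock columns).
frame = [[(x + y) % 256 for x in range(48)] for y in range(32)]
blocks = split_into_macroblocks(frame)
print(len(blocks))  # 6
```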
Sub-block
H264 uses 16X16 macroblocks for relatively flat images. However, to achieve a higher compression rate, a 16X16 macroblock can be further divided into smaller sub-blocks. The size of a sub-block can be 8X16, 16X8, 8X8, 4X8, 8X4 or 4X4, which is very flexible.
In the above picture, most of the 16X16 macroblock in the red frame is blue background, and part of the image of the three eagles falls inside this macroblock. To better process the partial images of the three eagles, H264 divides the 16X16 macroblock into multiple sub-blocks.
In this way, intra-frame compression yields more efficient data. The figure below shows the result of compressing the above macroblock using MPEG-2 and H264 respectively. The left half is the result of compression after MPEG-2 sub-block division, and the right half is the result of H264 sub-block compression. It can be seen that the H264 division method has more advantages.
After the macroblocks are divided, all the pictures in the H264 encoder buffer can be grouped.
Frame grouping
Video data contains two main kinds of redundancy: temporal redundancy and spatial redundancy. Of the two, temporal redundancy is the largest. Let's first talk about the temporal redundancy of video data.
Why is temporal redundancy the largest? Assuming that the camera captures 30 frames per second, the data in these 30 frames is mostly related. The correlation may extend well beyond 30 frames: dozens or even hundreds of consecutive frames can be closely related.
For such closely related frames, we only need to save the data of one frame; the other frames can be predicted from it according to certain rules. This is why temporal redundancy is the largest.
To compress related frames by prediction, the video frames must first be grouped. So how do we determine that certain frames are closely related and can be grouped together? Let's take a look at an example. Below is a set of captured video frames of billiard balls in motion. The billiard balls roll from the upper right corner to the lower left corner.
The H264 encoder will take out two adjacent frames each time to compare the macroblocks in order to calculate the similarity of the two frames. As shown below:
Through macroblock scanning and macroblock searching, the encoder finds that the correlation between the two frames is very high, and further that the correlation across this whole group of frames is very high. Therefore, the above frames can be placed in one group. The rule is roughly: between adjacent images, no more than 10% of the pixels differ, the brightness difference does not exceed 2%, and the chromaticity difference stays within 1%. Such images can be grouped together.
In such a group of frames, after encoding we keep the complete data only for the first frame; the other frames are calculated by referring to the previous frame. We call the first frame the IDR/I frame and the other frames P/B frames, and we call the encoded group of frames a GOP.
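The grouping rule can be sketched as a simple threshold test between adjacent frames. This is only an illustration of the thresholds quoted above, not how a real encoder places I frames: frames are modelled as lists of luma rows, the chroma check is left out for brevity, and the names `similar` and `group_frames` are hypothetical.

```python
def changed_fraction(a, b):
    """Fraction of pixel positions whose values differ between two frames."""
    total = len(a) * len(a[0])
    changed = sum(1 for row_a, row_b in zip(a, b)
                  for pa, pb in zip(row_a, row_b) if pa != pb)
    return changed / total


def mean_luma(frame):
    """Average brightness of a frame of 8-bit luma samples."""
    return sum(map(sum, frame)) / (len(frame) * len(frame[0]))


def similar(a, b, max_changed=0.10, max_luma_diff=0.02):
    """Apply the 10% changed-pixel and 2% brightness thresholds."""
    luma_diff = abs(mean_luma(a) - mean_luma(b)) / 255.0
    return changed_fraction(a, b) <= max_changed and luma_diff <= max_luma_diff


def group_frames(frames):
    """Group frame indices; a new group starts when adjacent frames differ too much."""
    if not frames:
        return []
    groups = [[0]]
    for i in range(1, len(frames)):
        if similar(frames[i - 1], frames[i]):
            groups[-1].append(i)
        else:
            groups.append([i])
    return groups


# Four synthetic 10x10 frames: two near-identical pairs separated by a scene cut.
f0 = [[100] * 10 for _ in range(10)]
f1 = [row[:] for row in f0]
for x in range(5):
    f1[0][x] = 110            # 5% of the pixels change slightly
f2 = [[200] * 10 for _ in range(10)]
f3 = [row[:] for row in f2]
f3[9][9] = 205                # 1% of the pixels change slightly
print(group_frames([f0, f1, f2, f3]))  # [[0, 1], [2, 3]]
```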
Motion estimation and compensation
After the frames are grouped in the H264 encoder, it is necessary to calculate the motion vectors of the objects in the frame group. Taking the above moving billiard video frame as an example, let's take a look at how it calculates the motion vector.
The H264 encoder first takes two frames of video data from the head of the buffer in sequence and then scans their macroblocks. When an object is found in one of the pictures, the encoder searches for it near the same position in the other picture (within the search window). If the object is found in the other picture, the motion vector of the object can be calculated. The following picture shows the position of the billiard ball after searching.
From the difference between the positions of the billiard ball in the above pictures, the direction and distance of the ball's movement can be calculated. H264 records the distance and direction of the ball's movement in each frame in turn, which gives the following.
After the motion vector is calculated, the identical part (that is, the green part) is subtracted to obtain the compensation data. In the end, we only need to compress and save the compensation data, and the original image can then be restored when decoding. The compressed data therefore only needs to record a small amount of data. As follows:
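The search-and-subtract process can be sketched with an exhaustive block-matching search that minimizes the sum of absolute differences (SAD). This is a simplified illustration under stated assumptions: real encoders use faster search patterns and sub-pixel precision, and the helper names, the 4x4 block size and the synthetic "ball" frames are all made up for the example. The motion vector here is the offset from the current block to its best match in the reference frame.

```python
def sad(ref, cur, ref_top, ref_left, cur_top, cur_left, size):
    """Sum of absolute differences between a reference block and a current block."""
    return sum(abs(ref[ref_top + y][ref_left + x] - cur[cur_top + y][cur_left + x])
               for y in range(size) for x in range(size))


def find_motion_vector(ref, cur, top, left, size=4, search_range=3):
    """Exhaustively search the window around (top, left) in the reference frame.

    Returns (cost, dy, dx) for the best-matching reference block.
    """
    height, width = len(ref), len(ref[0])
    best = None
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            r_top, r_left = top + dy, left + dx
            # Skip candidates that fall outside the reference frame.
            if r_top < 0 or r_left < 0 or r_top + size > height or r_left + size > width:
                continue
            cost = sad(ref, cur, r_top, r_left, top, left, size)
            if best is None or cost < best[0]:
                best = (cost, dy, dx)
    return best


def residual(ref, cur, top, left, dy, dx, size=4):
    """Compensation data: current block minus the motion-compensated reference block."""
    return [[cur[top + y][left + x] - ref[top + dy + y][left + dx + x]
             for x in range(size)] for y in range(size)]


def make_frame(ball_top, ball_left, size=12, ball=4):
    """Black frame with a bright square standing in for the billiard ball."""
    frame = [[0] * size for _ in range(size)]
    for y in range(ball):
        for x in range(ball):
            frame[ball_top + y][ball_left + x] = 200
    return frame


ref = make_frame(2, 6)   # ball in the upper right
cur = make_frame(4, 4)   # ball has moved down and to the left
cost, dy, dx = find_motion_vector(ref, cur, top=4, left=4)
print(cost, dy, dx)      # 0 -2 2
print(residual(ref, cur, 4, 4, dy, dx))  # all zeros: nothing left to encode
```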
Motion vectors and compensation together are called inter-frame compression technology, which removes the temporal redundancy of video frames. In addition to inter-frame compression, data must also be compressed within each frame. Intra-frame data compression removes spatial redundancy. Now we will introduce the intra-frame compression technology.
Intra prediction
The human eye has limited ability to resolve image detail: it is very sensitive to low-frequency brightness changes and not very sensitive to high-frequency ones. Therefore, based on such research, the data that the human eye is insensitive to can be removed from an image. Intra prediction was proposed against this background.
The intra-frame compression of H264 is very similar to JPEG. After an image is divided into macroblocks, each 4X4 luma block can be predicted in one of 9 modes (a 16X16 luma block has 4 modes). The encoder finds the prediction mode whose result is closest to the original image.
The following picture is the process of predicting each macro block in the entire picture.
The comparison between the image after intra prediction and the original image is as follows:
Then, the original image and the intra-predicted image are subtracted to obtain a residual value.
Then save the prediction mode information we got before, so that we can restore the original image when decoding. The effect is as follows:
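Mode selection and the residual can be sketched for a single 4X4 block using just three of the nine 4X4 modes (vertical, horizontal and DC). The function names and the sample block are assumptions for illustration, and the selection criterion here is plain SAD, whereas a real encoder typically also weighs the bit cost of each mode.

```python
def predict_vertical(top, left):
    """Copy the row of pixels above the block down every row."""
    return [list(top) for _ in range(4)]


def predict_horizontal(top, left):
    """Copy the column of pixels left of the block across every column."""
    return [[left[y]] * 4 for y in range(4)]


def predict_dc(top, left):
    """Fill the block with the rounded mean of the 8 neighbouring pixels."""
    dc = (sum(top) + sum(left) + 4) >> 3
    return [[dc] * 4 for _ in range(4)]


MODES = {"vertical": predict_vertical, "horizontal": predict_horizontal, "dc": predict_dc}


def block_sad(a, b):
    return sum(abs(pa - pb) for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))


def choose_mode(block, top, left):
    """Return (mode_name, residual) for the prediction closest to the block."""
    best_name = min(MODES, key=lambda name: block_sad(block, MODES[name](top, left)))
    prediction = MODES[best_name](top, left)
    residual = [[p - q for p, q in zip(row, pred_row)]
                for row, pred_row in zip(block, prediction)]
    return best_name, residual


# A block with vertical stripes that continue the pixels above it.
top = [10, 20, 30, 40]      # reconstructed pixels above the block
left = [10, 10, 10, 10]     # reconstructed pixels to the left of the block
block = [[10, 20, 30, 40] for _ in range(4)]
mode, res = choose_mode(block, top, left)
print(mode)  # vertical
print(res)   # all zeros: only the mode needs to be stored
```

The decoder repeats the same prediction from the stored mode and adds the residual back, which is why the mode information has to be saved.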
After intra-frame and inter-frame compression, although the data is greatly reduced, there is still room for optimization.
Do DCT on residual data
The residual data can be subjected to integer discrete cosine transform to remove the correlation of the data and further compress the data. As shown in the figure below, the left side is the macro block of the original data, and the right side is the macro block of the calculated residual data.
The macroblock of residual data is digitized as shown in the figure below:
The DCT is then applied to the residual data macroblock.
After the correlation in the data is removed, we can see that the data is further compressed.
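For a 4X4 residual block, the forward core transform of H264 can be written as Y = C · X · Cᵀ with a small integer matrix C. The sketch below applies it in pure Python; the scaling that the standard folds into the quantization step is deliberately left out, and the helper names are assumptions.

```python
# Core matrix of the H264 4x4 forward integer transform.
CF = [
    [1,  1,  1,  1],
    [2,  1, -1, -2],
    [1, -1, -1,  1],
    [1, -2,  2, -1],
]


def matmul(a, b):
    """Multiply two 4x4 integer matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def transpose(m):
    return [list(col) for col in zip(*m)]


def forward_transform(block):
    """Compute CF * block * CF^T using integer arithmetic only."""
    return matmul(matmul(CF, block), transpose(CF))


# A perfectly flat residual block: all energy lands in the top-left (DC) coefficient.
flat = [[5] * 4 for _ in range(4)]
for row in forward_transform(flat):
    print(row)
# [80, 0, 0, 0]
# [0, 0, 0, 0]
# [0, 0, 0, 0]
# [0, 0, 0, 0]
```

Sixteen correlated samples collapse into one non-zero coefficient, which is what makes the later quantization and entropy coding steps effective.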
The DCT alone is not enough; CABAC is still needed for lossless compression.
CABAC
The compression described above is lossy: once the transformed data has been quantized, the image cannot be restored exactly. CABAC, by contrast, is a lossless compression technology.
The lossless compression technique most familiar to everyone is probably Huffman coding, which assigns short codes to high-frequency symbols and long codes to low-frequency symbols to compress data. The VLC used in MPEG-2 is this kind of algorithm. Taking A-Z as an example, where A is high-frequency data and Z is low-frequency data, see how it is done.
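The short-code/long-code idea can be demonstrated with a small Huffman coder built on Python's `heapq`. This is a generic textbook construction, not the actual VLC tables of MPEG-2, and the sample text is made up so that A is frequent and Z is rare.

```python
import heapq
from collections import Counter


def huffman_codes(text):
    """Build a prefix code in which frequent symbols get shorter bit strings."""
    freq = Counter(text)
    # Heap entries: (count, tie_breaker, {symbol: code_so_far}).
    heap = [(count, index, {symbol: ""})
            for index, (symbol, count) in enumerate(sorted(freq.items()))]
    heapq.heapify(heap)
    if len(heap) == 1:
        # A single distinct symbol still needs one bit per occurrence.
        return {symbol: "0" for symbol in heap[0][2]}
    next_index = len(heap)
    while len(heap) > 1:
        count1, _, codes1 = heapq.heappop(heap)
        count2, _, codes2 = heapq.heappop(heap)
        # Merge the two rarest subtrees, prefixing their codes with 0 and 1.
        merged = {s: "0" + c for s, c in codes1.items()}
        merged.update({s: "1" + c for s, c in codes2.items()})
        heapq.heappush(heap, (count1 + count2, next_index, merged))
        next_index += 1
    return heap[0][2]


text = "AAAAAAAABBBBCCDZ"   # A is frequent, Z is rare
codes = huffman_codes(text)
encoded_bits = sum(len(codes[symbol]) for symbol in text)
print(len(codes["A"]), len(codes["Z"]))  # 1 4
print(encoded_bits)                      # 30, versus 48 with a fixed 3-bit code
```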
CABAC also assigns short codes to high-frequency data and long codes to low-frequency data. In addition, it adapts its compression to the context, which makes it much more efficient than VLC. The effect is as follows:
Now replace A-Z with a video frame, and it will look like the following.
It is obvious from the above picture that the lossless compression scheme using CABAC is much more efficient than VLC.
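Why context helps can be shown with a toy model. The sketch below does not implement the CABAC state machine of H264; it only computes the ideal number of bits an arithmetic coder would spend when driven by a simple adaptive probability estimate, with and without a context. The choice of "previous bit" as the context and the function name `adaptive_cost_bits` are assumptions made for illustration.

```python
import math


def adaptive_cost_bits(bits, use_context=False):
    """Ideal arithmetic-coding cost of a bit sequence under an adaptive model.

    Each context keeps its own counts of zeros and ones (starting at 1 each)
    and updates them after every coded bit. With use_context=True the context
    is the previously coded bit; otherwise a single shared context is used.
    """
    counts = {0: [1, 1], 1: [1, 1]}
    cost = 0.0
    previous = 0
    for bit in bits:
        context = previous if use_context else 0
        zeros_and_ones = counts[context]
        probability = zeros_and_ones[bit] / sum(zeros_and_ones)
        cost += -math.log2(probability)   # ideal code length for this bit
        zeros_and_ones[bit] += 1          # adapt the model
        previous = bit
    return cost


# Bits that come in runs: the previous bit is a strong hint for the next one.
bits = ([0] * 8 + [1] * 8) * 8
without_context = adaptive_cost_bits(bits)
with_context = adaptive_cost_bits(bits, use_context=True)
print(len(bits), round(without_context, 1), round(with_context, 1))
```

On this input the context-aware model needs noticeably fewer bits than the context-free one, and a highly skewed input can cost well under one bit per symbol, which a one-code-per-symbol VLC cannot achieve.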
Summary
At this point, we have covered the H264 coding principle. This article mainly discussed the following points:
1. A brief introduction to some basic concepts in H264, such as I/P/B frames and GOP.
2. A detailed explanation of the basic principles of H264 encoding, including:
Macroblock division
Image grouping
The principle of intra-frame compression technology
The principle of inter-frame compression technology
DCT
The principle of CABAC compression
Contact
Address:
No.305 Room HuiLan Building No.273 Huanpu Road Guangzhou China 510620