As a video website, Bilibili has seen its costs rise along with the growing volume and variety of its videos, so it set out to build a highly cost-effective encoder of its own. This article, adapted from the talk given by Ye Tianxiao, technical expert in Bilibili's video cloud technology department, at LiveVideoStackCon 2019 Beijing, details the development history of Bilibili's self-developed H.265 software encoder (YHEVC) and some of its optimizations and practices for the on-demand and live-streaming businesses.
Text / Ye Tianxiao
Organized by / LiveVideoStack
Hello everyone, I am Ye Tianxiao, a technical expert in Bilibili's video cloud technology department. My topic today is the practice and application of Bilibili's H.265 encoder in live streaming and on-demand.

First, a brief self-introduction. I first came into contact with video coding in 2004, just after the H.264 standard was finalized; my main research directions were H.263-to-H.264 transcoding and H.264 encoding. After returning to China I worked on video algorithms at several domestic companies. In 2018 I joined Bilibili, where I am now a technical expert on the video cloud team, responsible for the video codec team. We have since developed and launched the self-developed H.265 encoder (YHEVC), which is applied in the on-demand and live-streaming businesses.
1. Background of the self-developed H.265 encoder
This talk is divided into five parts. The first part introduces why we built our own encoder. The second part introduces our methodology for developing a video encoder, sharing some experience from an engineering perspective. The third part introduces the current status of the self-developed encoder and its application scenarios. Finally, I will introduce some specific optimizations of the encoder for live streaming and on-demand.

In its early days Bilibili did not transcode videos: uploads from content creators (UP主) were delivered directly to viewers, but this brought many drawbacks. First, uploaded videos used many coding standards, such as RealMedia, H.263, and H.264, and even within one standard there were multiple profiles (Baseline, Main, High) and different levels. Second, uploads came in various container formats, such as MP4, FLV, MOV, and MKV. When users played back different containers and coding formats, all kinds of playback problems could occur: a phone might not support a certain profile or level, or a player might handle a certain container poorly, leading to stuttering, black screens, and similar issues. Today Bilibili transcodes the vast majority of uploaded videos to avoid these playback-compatibility problems.
1.1 Bilibili's on-demand business

Over the past two years Bilibili's daily active users have doubled, and the company has adopted many strategies to encourage creators to upload videos. In January 2018 Bilibili launched the creator incentive plan, cultivating creators to make better videos and attract users. In February, Bilibili added support for watching 60 fps, 6 Mbps video, improving viewing quality for paying members. In August, Bilibili added support for DASH, with seamless switching between multiple resolutions. At the end of 2018, Bilibili raised the upload size limit for a single video from 4 GB to 8 GB, letting creators upload higher-quality source videos so that our transcoded output also got better. In June 2019, Bilibili added 4K resolution support for PGC content.
1.2 Bilibili's live-streaming business

On the live-streaming side, a 1080p60 6 Mbps stream of a battle-royale game like the one pictured above would cause severe stuttering for viewers on weak networks, so Bilibili uses a real-time transcoding scheme that brings the bitrate down to 2.5 Mbps or below, providing low-bitrate streams to meet the needs of a wide range of users.
1.3 Bilibili's cost pressure

Bilibili has gradually become something of a Chinese YouTube in the breadth of its video content. As the number of videos grows, and as frames get larger and videos get longer, Bilibili's costs have climbed rapidly, chiefly bandwidth cost, storage cost, and transcoding compute cost.
1.4 Why we built our own H.265 video encoder

The H.265 standard was finalized in 2013 and can save about 50% bitrate compared with H.264, and 90% of the phones on the market already support H.265 hardware decoding. To cut costs, Bilibili had three options for moving to H.265. The first was the open-source x265 encoder, but its compute requirements are high and its transcoding efficiency low. The second was a cloud vendor's transcoding service, but that would make transcoding very expensive. The third was a self-developed encoder, which is what we ultimately chose. Building a software encoder in-house demands strong development capability, but its controllability and customizability for Bilibili's business are a huge advantage, and in the long run it is the lowest-cost option. Moreover, an encoder is a natural data generator: beyond encoding, it can output all kinds of customizable information and combines well with AI-based optimization techniques.
2. How to build a video encoder

2.1 The design space of a video encoder

Developing a video encoder can be viewed as optimizing in a three-dimensional space whose axes are picture quality, complexity, and bitrate. This space has two extreme points. First, if complexity is no object and you only want the best picture quality, you can use HM, the H.265 reference software. Second, if you want the fastest possible encoder, you can transform a decoder directly, skipping all high-complexity encoding processes. Both extremes can be considered solved, with ready-made code available. The difficulty of general video encoder development lies in optimizing in the region between these two extremes, which requires a large R&D investment.
2.2 The R&D environment of a video encoder

The R&D environment of a video encoder consists mainly of three parts: a video decoder, a test framework, and data-analysis tools. While developing YHEVC, we modified the HM decoder so that it could quickly localize problems in a given coding module, using it as both a quality-evaluation and an encoder-verification tool. Our test framework is built from Python scripts and a database: during testing, the scripts invoke the encoder, decoder, and analysis tools and persist the results of many test scenarios to the database, so we can verify whether a new idea is effective. Our data-analysis tools consist of Python scripts and MATLAB, mainly for data visualization.
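The loop just described can be sketched roughly as follows. This is a minimal illustration, not Bilibili's actual harness: the `yhevc` binary name and its CLI flags are hypothetical stand-ins, and an in-memory SQLite database stands in for the real results store.

```python
import sqlite3

def build_encode_cmd(encoder, src, out, preset, qp):
    # Hypothetical CLI flags; the real YHEVC options are not public.
    return [encoder, "-i", src, "-o", out, "--preset", preset, "--qp", str(qp)]

def init_db(conn):
    conn.execute("""CREATE TABLE IF NOT EXISTS runs
        (clip TEXT, preset TEXT, qp INTEGER, psnr REAL, fps REAL)""")

def record_result(conn, clip, preset, qp, psnr, fps):
    # Persist one test scenario's measurements so new ideas can be
    # compared against historical baselines.
    conn.execute("INSERT INTO runs VALUES (?,?,?,?,?)",
                 (clip, preset, qp, psnr, fps))

def best_run(conn, clip):
    # Example query: highest-PSNR run for a clip, as one might use when
    # checking whether a new algorithm actually helped.
    return conn.execute(
        "SELECT preset, qp, psnr FROM runs WHERE clip=? ORDER BY psnr DESC LIMIT 1",
        (clip,)).fetchone()
```

In the real framework the scripts would run `build_encode_cmd` through a process launcher, decode the output, compute quality metrics, and then call something like `record_result`.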
2.3 The R&D process of a video encoder

The development of a video encoder can be roughly divided into three steps. The first is correct encoding, i.e., making the encoder and decoder match exactly. The second is efficient encoding: developing good, efficient algorithms that speed up the encoder while preserving coding quality. The last is business-aware encoding: optimizing and adapting the encoding pipeline to the business scenario.
2.4 YHEVC correctness test methods

In our experience, encoding correctness can be verified with two classes of tests: alignment tests between the encoder and decoder, between two encoders, and of the encoder against itself. In the first class, tests at the entropy-coding layer mainly verify the correctness of the NAL-layer code; tests at the transform layer verify the correctness of the IDCT and inverse-quantization code; and tests at the prediction layer verify the correctness of, for example, the inter-prediction and motion-compensation code. In the second class, Debug-vs-Release and x86-vs-Linux tests can expose data-alignment problems, while multithreaded-vs-single-threaded and first-run-vs-second-run tests verify that encoding is deterministic.
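The determinism checks in the second class boil down to encoding the same input several times and demanding bit-exact output. A minimal sketch, assuming the encoder is wrapped as a callable that returns the bitstream bytes:

```python
import hashlib

def digest(bitstream: bytes) -> str:
    """Hash of an encoded bitstream; equal digests mean bit-exact output."""
    return hashlib.sha256(bitstream).hexdigest()

def check_deterministic(encode, src, runs=2):
    """Run `encode` (a callable returning the bitstream bytes) several
    times on the same source and verify every run is bit-exact.
    The same check applies to single-threaded vs multithreaded builds:
    pass each build's wrapper and compare the resulting digests."""
    digests = {digest(encode(src)) for _ in range(runs)}
    return len(digests) == 1
```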
2.5.1 YHEVC correctness test example one

The following are three examples of YHEVC correctness tests.

The first example concerns CU size and TU size. Each row of the table represents one CU/TU configuration, and the first row is the default configuration recommended by most encoders using the H.265 standard. When developing YHEVC we tested not only this default configuration but verified every CU/TU setting the H.265 standard allows, to improve the robustness of the YHEVC encoder.
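The space of CU/TU settings to sweep can be enumerated mechanically. The sketch below uses a simplified reading of the H.265 SPS constraints (CTU size in {16, 32, 64}, minimum CU size at least 8, TU sizes powers of two between 4 and 32, and a TU no larger than the CTU); real conformance has a few more rules, so treat this as illustrative:

```python
from itertools import product

def cu_tu_configs():
    """Enumerate (ctu, min_cu, max_tu, min_tu) combinations under a
    simplified set of H.265 constraints."""
    sizes = [8, 16, 32, 64]
    tus = [4, 8, 16, 32]
    out = []
    for ctu, min_cu, max_tu, min_tu in product(sizes[1:], sizes, tus, tus):
        # CU sizes must fit the CTU; TU sizes must nest and cap at 32.
        if min_cu <= ctu and min_tu <= max_tu <= min(ctu, 32):
            out.append((ctu, min_cu, max_tu, min_tu))
    return out
```

Feeding every such tuple through the encode-decode match test is what "verifying all allowed CU/TU settings" means in practice.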
2.5.2 YHEVC correctness test example two

The second example concerns frame size. YHEVC's encoding tests can use frame sizes as small as 8×8; compared with the 64×64 minimum allowed by x265, this lets the YHEVC encoder surface bugs earlier. The frame sizes in this example and the CU/TU configurations in example one can also be combined in interleaved tests to further improve YHEVC's robustness.
2.5.3 YHEVC correctness test example three

The third example is randomized testing of the encoder. On the left is the video-analysis view of a normal encode; on the right is the analysis view in random mode. During testing, turning on the YHEVC encoder's debug switch sets the QP, CU mode decision, CU size, TU depth, motion vectors (MVs), and so on to random values, to verify the encoder's robustness. Note the MVs here: the random mode on the right produces a large number of wild MVs, which lets us verify that the synchronization mechanism used in multi-frame parallel encoding is correct.
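Conceptually, the debug switch replaces each encoding decision with a draw from the legal range. A toy sketch of such a randomized decision, where the QP range 0-51 and the CU sizes come from the H.265 standard and the remaining ranges are illustrative:

```python
import random

def random_cu_decision(rng):
    """One randomized set of per-CU decisions, as a fuzzing stand-in for
    the encoder's normal mode decision. MV range here is illustrative."""
    return {
        "qp": rng.randrange(0, 52),               # H.265 QP range 0..51
        "cu_size": rng.choice([8, 16, 32, 64]),
        "mode": rng.choice(["intra", "inter", "skip"]),
        "tu_depth": rng.randrange(0, 4),
        "mv": (rng.randrange(-64, 65), rng.randrange(-64, 65)),
    }
```

Driving every CU through decisions like these, then checking that the decoder still reconstructs the stream exactly, is what stresses the multi-frame parallel synchronization.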
2.6 Classifying YHEVC's encoder algorithms

By their effect on complexity and picture quality, encoder algorithms can be roughly divided into four categories. First, algorithms that improve picture quality at unchanged complexity; these are generally pre-analysis and rate-control algorithms, which are not very complex but can improve quality substantially. Second, algorithms that raise both complexity and picture quality; these come in two kinds, added coding tools such as weighted motion estimation, and added search complexity such as RDOQ. Third, algorithms that reduce both complexity and picture quality; these are the familiar fast algorithms, for which many papers exist, such as fast TU, PU, and CU partitioning strategies. Fourth, algorithms that reduce complexity at unchanged picture quality; these are engineering work, such as C-level optimization, assembly optimization, and multithreading.

Under a given compute budget, an algorithm combination can be chosen from these four categories to achieve the best picture quality, so finding the optimal combination is itself one of the hard R&D problems across these four algorithm classes.
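One simple way to frame the combination problem is as a budgeted selection: maximize total quality gain subject to a complexity cap. The greedy sketch below is an illustration only; the algorithm names and numbers are made up, and the real trade-off curves come from measurement (the talk does not describe Bilibili's actual selection method).

```python
def pick_algorithms(algos, budget):
    """Greedy selection: sort candidates by quality gain per unit of extra
    complexity and take them while the budget allows. `algos` is a list of
    (name, quality_gain, complexity_cost) tuples with cost > 0."""
    chosen, spent = [], 0.0
    for name, gain, cost in sorted(algos, key=lambda a: a[1] / a[2],
                                   reverse=True):
        if spent + cost <= budget:
            chosen.append(name)
            spent += cost
    return chosen
```

A greedy pass is only a heuristic; exact selection is a knapsack problem, which is one reason the talk calls finding the optimum a research difficulty.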
3. Status of the self-developed encoder

3.1 Status of the self-developed encoder

As for its current status, the YHEVC encoder was written from scratch and includes more than 40 fast algorithms and more than 80 configuration parameters. It supports the mainstream coding tools and multiple preset levels, and it is now used in Bilibili's on-demand and live-streaming services. Compared with x265, YHEVC reaches 3 to 10 times the encoding speed at the same picture quality.
3.2 YHEVC's performance in the on-demand business

Test results for the on-demand service: the PSNR chart on the left shows that YHEVC is within 0.3-0.4 dB of x265 at the same bitrate, while the speed chart on the right shows that YHEVC reaches several times x265's encoding speed.
3.3 YHEVC's performance in the live-streaming business

From the live-streaming results it can be seen that YHEVC's ultrafast preset matches the encoding quality of x265's veryfast while reaching twice its encoding speed.
3.4 History of YHEVC's rollout at Bilibili

Bilibili began trialing YHEVC in its on-demand and live transcoding systems and formally put it online in January 2019. Most of the video that Bilibili's mobile users watch now uses the H.265 format, and H.265 already accounts for more than 50% of traffic. By the end of the year, Bilibili's H.265 share of traffic is expected to reach 80%.
4. Video encoder optimizations for Bilibili's live-streaming business

4.1 The pain point of video encoders in live streaming

This chapter introduces YHEVC's optimizations for the live-streaming scenario. Software encoders have one big pain point in live streaming: the encoding speed is not constant, mainly because of the fast algorithms inside the encoder. In the example shown, the software encoding speed drops below 60 fps in complex scenes, causing stuttering for viewers.
4.2 Causes of the live-streaming pain point and its solutions

This pain point has several causes. First, the complexity of the videos being transcoded varies enormously; compare, say, a football match with a video conference. Second, different combinations of target resolution, bitrate, and frame rate place different demands on transcoding compute. Third, the CPU compute available changes during transcoding; for example, when one server runs three live transcodes, the jobs preempt each other's CPU resources.
There are two traditional remedies for this pain point. The first is to fix, for a given bitrate, frame rate, and resolution, one preset that guarantees real-time encoding, but this drives the bitrate up. The second is to tune a real-time-safe preset for each individual video, but this costs a great deal of manpower and is not always effective.
4.3 A complexity-adaptive video encoder solves the live-streaming pain point

For this pain point we developed a complexity-adaptive video encoder: a preset-analysis module inside the encoder monitors the encoder's state and its output FPS, and a preset-switching module in the encoding loop adjusts the preset accordingly.
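The control loop can be sketched as a simple feedback rule: step to a faster preset when the measured FPS falls behind the target, step back to a slower one when there is headroom. This is a minimal sketch; the preset ladder and thresholds are illustrative, and the real controller reacts to encoder state as well as output FPS.

```python
PRESETS = ["slow", "medium", "fast", "veryfast", "ultrafast"]  # slow = best quality

def adjust_preset(current, measured_fps, target_fps, margin=0.9):
    """One step of an illustrative preset controller."""
    i = PRESETS.index(current)
    if measured_fps < target_fps * margin and i < len(PRESETS) - 1:
        return PRESETS[i + 1]          # falling behind: go faster
    if measured_fps > target_fps * 1.2 and i > 0:
        return PRESETS[i - 1]          # plenty of headroom: regain quality
    return current
```

Calling this once per measurement window yields exactly the behavior shown in the figures below: the preset drops toward ultrafast in complex scenes and climbs back when complexity falls.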
4.4 Effect of the complexity-adaptive video encoder

Figure 1 shows the effect of the complexity-adaptive video encoder; the horizontal axis is time and the vertical axis is encoding FPS. With adaptive encoding, the encoder can drop to a lower-complexity preset in complex scenes and thus avoid stuttering.

Figures 2 and 3 show the coding quality and the complexity switching. When complexity is high, the adaptive encoder jumps the preset between ultrafast and veryfast; when complexity falls, it switches the preset back to slow.
4.5 Monitoring the complexity-adaptive video encoder

We also monitor the complexity-adaptive video encoder, collecting five pieces of information for each video: the average preset, the number of times the actual frame rate fell below 70% of the target frame rate, the number of times it fell below 50% of the target, the average PSNR, and the average SSIM.
5. Video encoder optimizations for Bilibili's on-demand business

5.1 The multi-resolution transcoding pipeline

5.2 A multi-resolution transcoding pipeline with a shared 1-pass

5.3 Design ideas of the shared-1pass video encoder

Finally, the optimizations of the video encoder for Bilibili's on-demand business. Since our on-demand service uses multi-resolution 2-pass encoding, we developed a shared 1-pass: we first run a single 1-pass analysis at 720p to produce an intermediate file, then use that file for the second pass at every target resolution, which reduces the complexity of the whole encoding pipeline by 36%.
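The shared-1pass flow can be sketched as follows. The `analyze` and `encode` callables are hypothetical stand-ins for the encoder's first- and second-pass invocations; how the 720p statistics are scaled to other resolutions is the part the encoder itself has to handle and is not shown here.

```python
def shared_1pass_transcode(src, resolutions, analyze, encode):
    """Run the analysis pass once at 720p and reuse its stats for every
    target resolution's second pass, instead of one full 2-pass per
    resolution."""
    stats = analyze(src, resolution="1280x720")   # single shared first pass
    return {res: encode(src, resolution=res, stats=stats)
            for res in resolutions}
```

With N output resolutions, the pipeline runs one analysis pass instead of N, which is where the complexity saving comes from.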
6. Future plans

Our next plans include: combining machine learning with the encoder, content-based video encoding, the new coding formats AV1 and VVC, video pre- and post-processing technology, and more.

We are ready to keep increasing our R&D investment and, through better codec technology, to give Bilibili's users a better viewing experience.