FMUSER Wirless Transmit Video And Audio More Easier !

[email protected] WhatsApp +8618078869184

    Mobile Taobao H.265 Codec Algorithm and Engineering Optimization

     

Mobile Taobao's products have expanded into live streaming, short video, and other fields, and the business now handles a large volume of pictures and videos. Since 2015 that volume has grown exponentially. With advances in network technology and rising consumer expectations, users also demand ever higher picture and video resolution, from the earliest 360p to today's 4K or even 8K. Together these trends have sharply increased bandwidth and storage costs. That is why we want H.265 technology to support the healthy growth of our audio and video services.

2. H.265 introduction

2.1 Cost effectiveness

H.265 lets us control the cost of audio and video services in large-scale concurrent scenarios, where bandwidth and storage costs are substantial. Compared with H.264, H.265 can save up to 50% of the bandwidth at the same image quality in a typical 4K scenario, and about 30% of the traffic in 720p live streaming.

More than 90% of current traffic comes from mobile devices (phones) rather than traditional PCs, servers, or tablets. On phones, and on Android devices in particular, limited computing power and the uneven performance caused by fragmentation restrict many H.265 optimizations and features. Applying H.265 to mobile devices directly leads to overheating and high power consumption during decoding, and to encoding that cannot keep up with real time. There is also no fast, efficient, and mature end-to-end coding solution. These problems need to be solved urgently.

2.2 Coding framework

Next, we introduce the coding framework of H.265.
The H.265 coding framework consists of four modules. When consecutive frames of a video are input, they first pass through the prediction module, which performs intra prediction and inter prediction. The transform and quantization module then applies a DCT and quantization to the difference (the residual) between the original image block and the predicted image block. Next, the decoding module reconstructs the image so that it can be used to predict the next frame. Finally, the entropy coding module applies arithmetic coding to the prediction information and the residual coefficients to remove the remaining coding redundancy.

2.3 Technical highlights

What are the main technical highlights of H.265? First, H.265 has a flexible coding structure with several levels of coding units, such as the CTU, CU, PU, and TU. Second, its block sizes are very flexible, covering 4 × 4, 8 × 8, and many other partition sizes. Third, the sample adaptive offset (SAO) technique in H.265 offers a better cost-performance ratio than deblocking and ALF. Finally, parallelization is designed into H.265 throughout, which makes its advantages more pronounced.

Beyond these highlights, H.265 improves on H.264 in interpolation, MV prediction, intra prediction, transform, deblocking filtering, and other areas. The table on the right side of the figure above gives quantitative data on how H.265 improves on H.264 in these respects.

2.4 Cost of the improvements

H.265 brings technical improvements to audio and video, but those improvements come at a cost that cannot be ignored. A lower bitrate means higher computational complexity. As the statistics table on the right of the figure above shows, compared with H.264 the encoding complexity of H.265 increases by about 3 to 4 times and the decoding complexity increases by nearly 50%.
This means that traditional hardware and software solutions built for H.264 cannot handle H.265, and we have to solve the technical challenges that come with upgrading them.

3. Implementation of a high-efficiency H.265 codec

Even so, the advantages of H.265 cannot be ignored. The figure above shows a survey we carried out in 2017. Taking the Jinshan HEVC decoder as an example, it has a large advantage over the standard H.265 decoder in both decoding speed and decoding quality. The survey results made us more confident about the future of H.265.

3.1 RDO optimization

What has Mobile Taobao explored in H.265 encoding and decoding? Our work falls into two parts, algorithm optimization and engineering optimization, and the algorithm optimization focuses mainly on RDO (rate-distortion optimization).

Because HEVC supports many combinations of CTU / CU / PU / TU, the number of candidate coding modes rises sharply. As the number of modes grows, the rate-distortion calculation becomes the bottleneck of the search for the optimal coding mode. The main reason is that the SATD-based rate-distortion optimization used in traditional H.264 cannot be used in H.265, which requires a more accurate distortion cost.

To make RDO more efficient, we made the following 7 improvements:

- Predict CU depth levels efficiently.
- Terminate the CU traversal early based on texture information.
- Handle the nonlinear cases of image partitioning with a convolutional neural network.
- Detect residual AZBs (all-zero blocks) in advance to reduce the calculation of D and R.
- Use a fast calculation model for the quantization distortion D and the residual bit count R.
- Use a fast ME (motion estimation) calculation model based on monotonicity.
- Quickly select a suitable intra prediction mode among the 35 modes.
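To make the bottleneck concrete, here is a minimal sketch of the exhaustive mode decision that RDO performs, assuming the standard Lagrangian cost J = D + λR. The `Candidate` class, the mode names, and all numbers are hypothetical illustrations, not values from the Mobile Taobao encoder.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """One candidate coding mode (names and numbers are hypothetical)."""
    name: str
    distortion: float  # D: e.g. SSE between source and reconstruction
    bits: float        # R: bits needed to code the block in this mode


def rd_cost(candidate, lam):
    """Lagrangian rate-distortion cost J = D + lambda * R."""
    return candidate.distortion + lam * candidate.bits


def best_mode(candidates, lam):
    """Exhaustive RDO: evaluate every candidate and keep the cheapest."""
    return min(candidates, key=lambda c: rd_cost(c, lam))


modes = [
    Candidate("2Nx2N", distortion=1000.0, bits=20.0),
    Candidate("NxN", distortion=600.0, bits=120.0),
]
low_lambda_choice = best_mode(modes, lam=2.0)    # small lambda favors low distortion
high_lambda_choice = best_mode(modes, lam=10.0)  # large lambda favors low rate
```

Every candidate needs its own D and R, so the cost of this loop grows with the number of modes. The improvements listed above either shrink the candidate list or make D and R cheaper to obtain.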
1) Mode partitioning: CTU / CU / PU / TU

Taking the PU as an example, H.264 has 7 partition modes, while H.265 has 24. Counting every partition mode that can be selected for an image in H.265, a block has up to 384 selectable partition modes, and the best partition can only be determined after all of them have been evaluated. How can we shorten the time needed to evaluate so many choices?

2) RDO optimization

Fast mode decision: depth prediction

First, the partition level and depth of each block correlate strongly with those of the current block's reference frame block. Using this temporal and spatial correlation, we can estimate the depth range of the block from the depth of the reference block and obtain a min depth and a max depth. Second, even though the block is related to the previous block or the reference block, it also carries information of its own. By combining the block's motion and texture information, we can narrow the range further and determine a more precise partition depth range.

Fast mode decision: texture and corner detection

By analyzing image texture, we can quickly select the optimal partition mode. Flat textures and strong-contrast textures are identified quickly, which makes partitioning more efficient.

Fast mode decision: CNN classification

For textures with corners, we can decide quickly whether to keep splitting based on the corner strength, which works well for linear variations. In practice, however, many application scenarios involve nonlinear variations. For those we use a convolutional neural network (CNN) deep learning model to classify the texture and select the mode quickly. Linear and nonlinear analysis are carried out separately.
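The depth-prediction step described above can be sketched as follows. This is a simplified illustration, assuming a 64x64 CTU with CU depths 0 to 3; the function names, the `margin` parameter, and the way reference depths are collected are assumptions, not the authors' actual rules.

```python
MAX_CU_DEPTH = 3  # assuming a 64x64 CTU: depth 0 is 64x64, depth 3 is 8x8


def predict_depth_range(reference_depths, margin=1):
    """Return (min_depth, max_depth) to search for the current block.

    reference_depths: CU depths of already coded spatial neighbors and of
    the co-located block in the reference frame (hypothetical inputs).
    margin: slack added on both sides of the observed range (assumption).
    """
    if not reference_depths:
        # No usable correlation: fall back to the full depth search.
        return 0, MAX_CU_DEPTH
    lo = max(0, min(reference_depths) - margin)
    hi = min(MAX_CU_DEPTH, max(reference_depths) + margin)
    return lo, hi


def depths_skipped(depth_range):
    """Number of depth levels the encoder no longer has to evaluate."""
    lo, hi = depth_range
    return (MAX_CU_DEPTH + 1) - (hi - lo + 1)
```

For example, `predict_depth_range([2, 2, 3], margin=0)` limits the search to depths 2 and 3, so the two coarsest levels are never evaluated. Motion and texture information from the block itself could then tighten the range further, as the text describes.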
Fast mode decision: AZB decision

An AZB (all-zero block) is a block whose coefficients are all zero after quantization. By detecting AZBs early, blocks can be classified quickly, which reduces the calculation of D and R.

Distortion and bits estimation

Looking at the whole calculation process, computing the distortion D of a coding mode P requires the SSE between the original image and the reconstructed image. That means completing prediction, transform, quantization, inverse quantization, inverse transform, and reconstruction for mode P. To avoid this long and complex process, we can instead calculate the residual energy in the frequency domain, after transform and quantization.

The rate is normally obtained by running entropy coding once. To improve efficiency, we build a linear estimation model for the rate of the residual data and estimate the rate from the characteristics of the quantized N×N transform matrix, which reduces the amount of calculation by nearly 50%.

FME optimal search position estimation

Here the main optimization is to estimate the best 1/4-pixel position from the SAD values and coordinates of the integer-pixel and 1/2-pixel positions, which speeds up the whole search process.

Fast intra prediction method

We adopt a fast intra prediction decision method based on a Bayesian model. It doubles the intra prediction speed while limiting the quality loss to 0.01 dB.

3) Rate control optimization

We use the following strategies to optimize rate control and lookahead. First, we adjust the CU QP based on CUTree information propagation. Second, we adjust the frame QP of I, B, and P frames based on bitrate and complexity. Finally, we optimize the slice type decision based on reference strength. I will describe this part in more detail at LiveVideoStackCon 2018.
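As an illustration of the AZB early decision described above, the sketch below uses a simplified model: an orthonormal floating-point DCT and a dead-zone quantizer, rather than HEVC's integer transform. Under that assumption every coefficient magnitude is at most (2/N)·SAD, so a cheap SAD test can prove that a block quantizes to all zeros before any transform is run. The function names, the `qstep` values, and the rounding `offset` are hypothetical.

```python
import math


def dct2(block):
    """Orthonormal 2-D DCT-II of a square block given as a list of rows."""
    n = len(block)

    def scale(k):
        return math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)

    out = [[0.0] * n for _ in range(n)]
    for u in range(n):
        for v in range(n):
            acc = 0.0
            for x in range(n):
                for y in range(n):
                    acc += (block[x][y]
                            * math.cos((2 * x + 1) * u * math.pi / (2 * n))
                            * math.cos((2 * y + 1) * v * math.pi / (2 * n)))
            out[u][v] = scale(u) * scale(v) * acc
    return out


def quantize(coeffs, qstep, offset=1.0 / 3.0):
    """Dead-zone quantizer: level = sign(c) * floor(|c| / qstep + offset)."""
    return [[int(math.copysign(math.floor(abs(c) / qstep + offset), c))
             for c in row] for row in coeffs]


def is_azb_early(residual, qstep, offset=1.0 / 3.0):
    """Sufficient (not necessary) SAD test for an all-zero block.

    Each DCT basis value has magnitude <= 2/N, so |coeff| <= (2/N) * SAD.
    A level is zero when |coeff| < (1 - offset) * qstep.
    """
    n = len(residual)
    sad = sum(abs(v) for row in residual for v in row)
    return sad < (1.0 - offset) * qstep * n / 2.0


def all_zero(levels):
    return all(level == 0 for row in levels for level in row)


small = [[1, -1, 0, 1], [0, 1, -1, 0], [1, 0, 0, -1], [0, -1, 1, 0]]
large = [[10] * 4 for _ in range(4)]
```

With `qstep=20`, `is_azb_early(small, 20)` is true and the full transform and quantization can be skipped, while `large` fails the test and goes through the normal path. Because the test is only a sufficient condition, some true all-zero blocks are still missed. A production encoder would tune the threshold to its own transform and quantizer.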
4) Reference frame optimization: long-term reference frames

Most reference frames lie within a single GOP, and they usually have better coding quality. A reference frame helps improve the quality of the frames that refer to it, while a long-term reference frame can be used across multiple GOPs. For live scenes whose background changes little, a long-term reference frame therefore effectively reduces the loss that builds up when information is passed along through many frames. Referring to a long-term reference frame can raise the average quality by about 0.25 dB. The figure above shows the share of computation taken by each module.

These are some of our explorations in the field of RDO.

3.2 Engineering optimization method

We also carried out a number of engineering optimizations. First, we optimized some special functions in assembly. Using the NEON instruction set, we improved typical computing performance by 2 to 4 times in modules such as RDO (SSE, SAD), motion search, and intra prediction. Second, for the multi-core processors widely used in mobile devices, we optimized multi-core parallel computing and adapted to the architecture of modern processors. In addition, we optimized the instructions and memory access at the bottlenecks to further improve overall performance.

3.3 Optimization results

1) Software encoding

After optimization at both the algorithm and the engineering level, we achieved a significant performance improvement in the HEVC codec. In encoding speed, Mobile Taobao's encoder is more than three times faster than the early-2017 version of x265 and can encode 720p at 30 fps in real time on an iPhone 6. At the same coding quality, its bitrate is at least 15% lower than that of x265. The figure above shows the specific test results.
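For reference, the SAD and SSE kernels named in section 3.2 as NEON optimization targets compute the following. This is a plain scalar sketch for clarity only. The source describes hand-written NEON assembly, which is not reproduced here, and the function names and sample blocks are hypothetical.

```python
def sad(block_a, block_b):
    """Sum of absolute differences between two equally sized blocks."""
    return sum(abs(a - b)
               for row_a, row_b in zip(block_a, block_b)
               for a, b in zip(row_a, row_b))


def sse(block_a, block_b):
    """Sum of squared errors between two equally sized blocks."""
    return sum((a - b) ** 2
               for row_a, row_b in zip(block_a, block_b)
               for a, b in zip(row_a, row_b))


source = [[1, 2], [3, 4]]
prediction = [[1, 0], [5, 4]]
```

Both kernels apply the same independent operation to every pixel pair and then reduce the results to one number. That pattern maps well onto SIMD instructions that process many pixels at once, which is why such functions are typical candidates for NEON optimization.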
2) Software decoding

Software decoding optimization focuses more on engineering, that is, on the NEON instruction set, and less on algorithms. By optimizing with the NEON instruction set and rewriting some logic, Mobile Taobao's decoder improves video decoding speed by more than 150% compared with FFmpeg. For example, when decoding a 1 Mbps 720p H.265 test stream on a Xiaomi 5 phone, the decoding speed exceeds 200 frames per second and CPU usage stays below 20%. For image processing, we have also made many optimizations and improvements to the standard H.265 I-frame processing. The picture decoding speed of APG is more than 70% higher than that of FFmpeg.

4. Summary and future outlook

Here we briefly look ahead to the future of video coding standards. Over the past 10 years, everyone has followed the H.264 and H.265 standards. As streaming media becomes the trend, Google's VP8, VP9, and VP10, together with H.266, AV1, and other standards, will move the audio and video industry toward a brighter future. Technical progress lays the foundation for newer effects, and it supports consumption upgrades and a better user experience.

If we simply compare H.266 with AV1, we can see that because H.266 introduces the quadtree plus binary tree structure (QTBT), the overall BD-rate improves by nearly 4%. Compared with H.265, H.266 reduces the bitrate by nearly half at the same image quality, but the overall encoding time increases by about two times and the overall decoding time nearly doubles. Based on previous research, we found that the time complexity of AV1 is 2000 to 3000 times that of x265, which shows that AV1 still has a long way to go in improvement and optimization.

Q&A

Q: Will VP9 be widely adopted in the future?
A: I think the biggest obstacle to the widespread adoption of VP9 is compatibility. Many devices follow a backward-compatibility strategy, and the penetration rate of VP9 on many devices is not high. If you encode with VP9 in live streaming and similar fields, you must transcode to H.264 or H.265 before distribution through a CDN so that the stream can be played back, which is bound to cause a lot of trouble.

Q: Can hardware encoding be implemented on the client?

A: Some high-end phones can now hardware-encode H.265 with stable performance. On iOS, for example, models from the iPhone 7 upward can do so reliably. On the fragmented Android ecosystem, however, encoding performance suffers badly because different models support the various standards inconsistently and because the performance gap between phone tiers is large. Streamers on Mobile Taobao mostly use iOS devices, so we mainly use the hardware encoding scheme on Taobao. In a live streaming environment, content has to be distributed through a CDN, and the links that CDNs support are mostly H.264, so the choice of codec is limited by the whole live streaming chain.

Original title: Mobile Taobao H.265 codec algorithm and engineering optimization. Source: WeChat ID livevideostack; WeChat official account: microwave radio Forum. Please credit the source when republishing.

     

     

     

     




       