Comparison of mainstream codecs (H.264 AVC, H.265 HEVC, VP8, VP9)

"Mainstream codec (H.264 AVC, H.265 HEVC, VP8, VP9) Compare Be Be This switched: http: //houh-1984.blog.163.com/blog/static/31127834201321995354105/ Outline H.264 (MPEG 4, class 10) is embedded in the mobile device and the most used algorithm standard video codec. Currently more than 50 companies to provide H.264-related products (H.264 hardware / software building blocks). In recent years Google Google open source WebM video coder to build Android-based mobile and consumer video platform. There are other open-source video codec standards such as Dirac, FFMpeg and Theora, but few applications on the market. Some leaders in the market such as Allegro, Broadcom, Samsung, and ViXS, have already begun to develop H.265High Efficiency Video Coder (HVEC) of. There are a number of discussion forums on Embedded H.264 / H.265 encoding blog article or white paper decoder design: Making mobile video apps more energy efficient A tutorial on the H.264 scalable video codec Introduction to video transcoding for consumer electronics A low power implementation of the H.264 codec for consumer apps Trade-offs with H.264 and other video codecs Wireless HDMI with a low latency, lossless H.264 video codec Zero latency time-critical video encode / decode with H.264 Codec designfor nextgen Internet Video Using the H.264 spec to do anywhere, anytime placeshifted video, Apple's continued support for the standards Although H.264 / H.265 video codec is still the first choice, but because of patent issues, Google's open-source WebM codec is also embedded platform has been a great application. Google's chrome priority processor also integrates standard codec VP8, VP9 of. H.264 / MPEG-4 AVC Outline H.264 / MPEG-4 Part 10, also known as AVC (Advanced Video Coding, Advanced Video Coding), is a video compression standard, high-precision video recording of a widely used compression format and distribution. The final draft of the first edition of the standard was completed in May 2003. 
H.264 / MPEG-4 AVC is a block-based, motion-compensated codec standard. It was developed by the Joint Video Team (JVT), a partnership between the ITU-T Video Coding Experts Group (VCEG) and the ISO / IEC Moving Picture Experts Group (MPEG). The ITU-T H.264 standard and the ISO / IEC MPEG-4 AVC standard (formally ISO / IEC 14496-10, MPEG-4 Part 10, Advanced Video Coding) have identical technical content and are maintained jointly. H.264 is well known as one of the codec standards for Blu-ray Discs, and all Blu-ray Disc players must be able to decode it. It is also widely used for Internet streaming by services such as Vimeo, YouTube and Apple's iTunes Store, by web software such as Adobe Flash Player and Microsoft Silverlight, and for a variety of high-definition television broadcasts over terrestrial (ATSC, ISDB-T, DVB-T or DVB-T2), cable (DVB-C) and satellite (DVB-S and DVB-S2) systems.

Figure 1. H.264 coding flowchart

Technical details

H.264 / AVC contains a number of new features that make it not only more efficient than previous codecs but also usable in a wide range of network environments. These new features include:

Multiple reference frame motion compensation. Compared with previous video coding standards, H.264 / AVC uses previously encoded frames as references in a much more flexible way. In some cases up to 16 reference frames (or 32 reference fields with interlaced coding) may be used, whereas previous standards limited the number to one, or two for B frames. In most scenes this feature brings a modest bit rate reduction or quality improvement, but for certain types of scenes, such as rapid repetitive flashing, back-and-forth scene cuts or uncovered background areas, it allows a significant reduction in bit rate.

Variable block size motion compensation.
Block sizes from as large as 16x16 down to as small as 4x4 can be used for motion estimation and motion compensation, which allows moving regions of the picture to be segmented more precisely. The supported partition types are 16x16, 16x8, 8x16, 8x8, 8x4, 4x8 and 4x4.

Six-tap interpolation filter. To reduce aliasing and obtain a sharper image, a six-tap digital filter is used to generate the luma prediction values at half-sample positions.

Field-mode macroblocks. The macroblock pair structure allows 16x16 macroblocks in field mode (compared with 16x8 half-macroblocks in MPEG-2).

Quarter-sample motion compensation. Motion-compensated prediction with 1/4 pixel accuracy describes the displacement of moving blocks more precisely. Because chroma is typically sampled at half the luma resolution (see 4:2:0), the motion compensation accuracy for chroma reaches 1/8 pixel.

Weighted prediction. A scaling weight and an offset can be applied during motion compensation. This gives a considerable coding gain in special cases such as fade-in, fade-out and cross-fade transitions.

In-loop deblocking filter. The filter reduces the blocking artifacts common to other video codecs based on the discrete cosine transform (DCT).

Integer transforms. An exact-match 4x4 integer transform (similar in design to the DCT) is used. The High profiles add an 8x8 integer transform, and the encoder can select adaptively between the 4x4 and 8x8 transforms. After the first 4x4 transform, a Hadamard transform is applied to the DC coefficients (the chroma DC coefficients and, in one special case, the luma DC coefficients) to obtain better compression in smooth regions.

Intra spatial prediction. Blocks are predicted from the boundary pixels of neighboring blocks. This works better than the DC-only coefficient prediction of MPEG-2 Video and the transform coefficient prediction of H.263+ and MPEG-4 Visual.
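The half-sample and quarter-sample interpolation mentioned above can be illustrated with a short Python sketch. It assumes the H.264 luma filter taps (1, -5, 20, 20, -5, 1) with a divisor of 32, and quarter-sample values formed by averaging two neighboring values with upward rounding. The function names, the edge-replication padding and the restriction to one row of 8-bit samples are simplifications introduced here for illustration; they are not taken from the article.

```python
# Illustrative sketch only: one row of 8-bit luma samples, horizontal direction.
TAPS = (1, -5, 20, 20, -5, 1)  # six-tap half-sample filter, taps sum to 32


def clip8(value):
    """Clamp a value to the 8-bit sample range."""
    return max(0, min(255, value))


def half_sample_row(row):
    """Return the half-sample value to the right of each integer sample.

    The row is padded by repeating its edge samples (an assumption made
    here so that every position has six input samples).
    """
    padded = [row[0]] * 2 + list(row) + [row[-1]] * 3
    result = []
    for i in range(len(row)):
        window = padded[i:i + 6]  # samples i-2 .. i+3 of the original row
        acc = sum(t * s for t, s in zip(TAPS, window))
        result.append(clip8((acc + 16) >> 5))  # round, divide by 32, clip
    return result


def quarter_sample(a, b):
    """Average two neighboring integer/half-sample values, rounding up."""
    return (a + b + 1) >> 1


row = [0, 0, 0, 255, 255, 255]
half = half_sample_row(row)
print(half)
print(quarter_sample(row[2], half[2]))
```

The negative taps are what keep edges sharp: around the step edge the filter briefly undershoots and overshoots, and the result is clipped back into the 0 to 255 range.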
Context-adaptive binary arithmetic coding (CABAC). CABAC losslessly compresses the various syntax elements in a flexible way, and is more efficient because it uses the known probability distribution of each element in its context.

Context-adaptive variable-length coding (CAVLC). CAVLC is used to code the quantized transform coefficients. It is less complex than CABAC and achieves a lower compression ratio, but it is still considerably more effective than the entropy coding schemes used by previous video coding standards.

Exponential-Golomb coding. Syntax elements that are coded by neither CABAC nor CAVLC use Exponential-Golomb (Exp-Golomb) codes.

Network abstraction layer (NAL). The NAL allows the same video syntax to be used in a variety of network environments. Sequence parameter sets (SPSs) and picture parameter sets (PPSs) provide greater robustness and flexibility.

Switching slices (SP and SI slices). These allow the encoder to direct the decoder to jump into a video stream that is already in progress, which supports bit rate switching and trick mode operation. When a decoder uses an SP / SI slice to jump into the middle of a video stream, it obtains exactly the same reconstructed picture even though the pictures it used as references before the switch are different, or absent altogether.

Flexible macroblock ordering (FMO, also known as slice groups) and arbitrary slice ordering (ASO). These modes change the coding order of macroblocks, the basic coding units of a picture. They can improve the robustness of the bitstream against channel errors and serve some other purposes.
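As an illustration of the Exp-Golomb codes mentioned above, the following Python sketch encodes and decodes unsigned and signed values in the order-0 form used for H.264 syntax elements. Bit strings are represented as Python strings of "0" and "1" for readability; that representation and the function names are assumptions made for this example, not part of any codec API.

```python
# Illustrative sketch of order-0 Exp-Golomb codes on "0"/"1" strings.


def ue_encode(value):
    """Unsigned Exp-Golomb: N leading zeros followed by the (N+1)-bit value+1."""
    code = value + 1
    prefix_len = code.bit_length() - 1
    return "0" * prefix_len + format(code, "b")


def ue_decode(bits):
    """Decode one unsigned code from the start of bits.

    Returns (value, number_of_bits_consumed).
    """
    zeros = 0
    while bits[zeros] == "0":
        zeros += 1
    length = 2 * zeros + 1
    return int(bits[zeros:length], 2) - 1, length


def se_encode(value):
    """Signed Exp-Golomb: map 1, -1, 2, -2, ... to 1, 2, 3, 4, ... first."""
    mapped = 2 * value - 1 if value > 0 else -2 * value
    return ue_encode(mapped)


def se_decode(bits):
    """Decode one signed code; returns (value, number_of_bits_consumed)."""
    mapped, length = ue_decode(bits)
    value = (mapped + 1) // 2 if mapped % 2 else -(mapped // 2)
    return value, length


for v in range(5):
    print(v, ue_encode(v))
```

Small values get short codewords, which suits syntax elements whose values are usually close to zero.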
Data partitioning (DP). Syntax elements of differing importance can be packed into separate packets for transmission, and unequal error protection (UEP) can then be applied to make the video stream more robust against channel errors and losses.

Redundant slices (RS). This is another technique for improving the robustness of the stream. The encoder can send an additional representation of part or all of a picture (usually coded at lower fidelity), which the decoder can use when the primary representation is corrupted or lost.

Start code emulation prevention. The byte stream is packaged by an automatic method that prevents start codes from being emulated by the coded data. Start codes are the code words in the bitstream that allow random access and recovery of byte synchronization.

Supplemental enhancement information (SEI) and video usability information (VUI). These add extra information to the video stream for use by a variety of applications.

Auxiliary pictures. These can be used for special purposes such as alpha compositing.

Frame numbering. This supports the creation of sub-sequences, which can be used for temporal scalability, and the detection and concealment of the loss of entire pictures (caused, for example, by network packet loss or channel errors).

Picture order count. This keeps the ordering of pictures and the sample values of each decoded picture independent of timing information, so that a system can carry, control and change timing information separately without affecting the sample values of the decoded pictures.
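The start code emulation prevention described above can be sketched in Python. The sketch assumes the H.264 byte stream convention in which a start code begins with the bytes 00 00 01, and an extra byte 03 is inserted into the payload after any two consecutive zero bytes that would otherwise be followed by a byte in the range 00 to 03. The function names are chosen for this example only.

```python
# Illustrative sketch of emulation prevention for a NAL unit payload.


def add_emulation_prevention(payload: bytes) -> bytes:
    """Insert 0x03 after two zero bytes when the next byte is 0x00..0x03."""
    out = bytearray()
    zeros = 0  # number of consecutive zero bytes just written
    for byte in payload:
        if zeros >= 2 and byte <= 3:
            out.append(3)  # emulation prevention byte
            zeros = 0
        out.append(byte)
        zeros = zeros + 1 if byte == 0 else 0
    return bytes(out)


def remove_emulation_prevention(data: bytes) -> bytes:
    """Drop each 0x03 that directly follows two zero bytes."""
    out = bytearray()
    zeros = 0
    for byte in data:
        if zeros >= 2 and byte == 3:
            zeros = 0  # skip the inserted byte
            continue
        out.append(byte)
        zeros = zeros + 1 if byte == 0 else 0
    return bytes(out)


raw = bytes([0x25, 0x00, 0x00, 0x01, 0x00, 0x00, 0x00, 0x00, 0x7F])
escaped = add_emulation_prevention(raw)
print(escaped.hex(" "))
print(remove_emulation_prevention(escaped) == raw)
```

After escaping, the payload can no longer contain the start code prefix, so a decoder that scans for 00 00 01 only ever finds real unit boundaries.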
These techniques, together with several others, allow H.264 to perform significantly better than previous video codecs and to be used more widely in a variety of environments. The compression performance of H.264 is greatly improved over MPEG-2: at the same image quality, the bit rate can be reduced to half or less.

Like other MPEG video standards, H.264 / AVC has a reference software implementation, JM (http://iphome.hhi.de/suehring/tml/), as well as a fast open source implementation, x264 (http://www.videolan.org/x264.html). Both can be downloaded free of charge. The main purpose of the reference software is to demonstrate the features of H.264 / AVC, not to serve as a platform for direct use in applications. MPEG is currently also working on a number of hardware reference designs.

Applications and Implementation

Apple integrated H.264 into Mac OS X v10.4 (Tiger) and released QuickTime 7.0 with H.264 support in May 2005.

The 3rd Generation Partnership Project (3GPP) approved H.264 / AVC in Release 6 as an optional multimedia technology in its mobile service standards.

The Motion Imagery Standards Board (MISB) of the US Department of Defense has adopted H.264 / AVC as its recommended video codec for core applications.

The Internet Engineering Task Force (IETF) has completed a payload format (RFC 3984) for carrying H.264 / AVC streams over the Real-time Transport Protocol (RTP).

The Internet Streaming Media Alliance (ISMA) has adopted H.264 / AVC for its ISMA 2.0 technical specifications.

MPEG has fully integrated H.264 / AVC into its systems protocols (for example MPEG-2 and MPEG-4 Systems) and into the ISO media file format.
The ITU-T standards group of the International Telecommunication Union has adopted H.264 / AVC as part of its H.32x series of multimedia telephony system specifications. Following its adoption by ITU-T, H.264 / AVC has become widely used in videoconferencing systems and videophones, and it is supported by the two major product providers, Polycom and Tandberg. Virtually all new videoconferencing products support H.264 / AVC.

H.264 is also likely to be used by a variety of video-on-demand (VOD) services that deliver movies and TV shows over the Internet directly to a PC.

H.264 / MPEG-4 AVC SVC

Scalable Video Coding (SVC) is an extension of conventional H.264 / MPEG-4 AVC coding that provides greater coding flexibility. Its three major features, temporal scalability, spatial scalability and signal-to-noise ratio (SNR) scalability, let video transmission adapt better to heterogeneous network bandwidth.

The goal of SVC standardization is to allow a high-quality video bitstream to be encoded so that it contains one or more subset bitstreams that can themselves be decoded, with a complexity and reconstruction quality similar to those achieved by existing H.264 / MPEG-4 AVC using the same amount of data as the subset bitstream. A subset bitstream can represent a lower spatial resolution, a lower temporal resolution or a lower-quality video signal (each separately or in combination).

Temporal (frame rate) scalability: motion compensation dependencies are structured so that complete pictures (that is, their associated packets) can be dropped from the bitstream. (Temporal scalability is already available in H.264 / MPEG-4 AVC. SVC only provides supplemental enhancement information to improve its usage.)

Spatial (picture size) scalability: the video is coded at multiple spatial resolutions. The data and decoded samples of lower resolutions can be used to predict the data or samples of higher resolutions, reducing the bit rate needed to code the higher resolutions.

SNR / quality scalability: the video is coded at a single spatial resolution but at different qualities.
The data and decoded samples of lower qualities can be used to predict the data or samples of higher qualities, reducing the bit rate needed to code the higher qualities.

Combined scalability: a combination of the three scalability modes above.

Technical details

Scalable Video Coding includes several scalable profiles: Scalable Baseline, Scalable High, Scalable High Intra, Scalable Constrained Baseline and Scalable Constrained High. Each profile combines an H.264 / MPEG-4 AVC profile for the base layer with tools for the scalable extension:

Scalable Baseline Profile: intended for conversational, mobile and surveillance applications. The base layer conforms to the H.264 / MPEG-4 AVC Baseline Profile. The enhancement layers support B slices, weighted prediction, CABAC entropy coding and the 8x8 luma transform, which the base layer does not. Spatial scalability is restricted to resolution ratios of 1.5 and 2 in both the horizontal and vertical directions. Quality and temporal scalability are supported without restriction.

Scalable High Profile: intended for broadcast, streaming, storage and videoconferencing applications. The base layer conforms to the H.264 / MPEG-4 AVC High Profile. All tools in the Scalable Video Coding extension are supported, and spatial, quality and temporal scalability are not restricted.

Scalable High Intra Profile: intended for professional applications. It uses only Instantaneous Decoder Refresh (IDR) pictures, which do not use previous pictures as references. The base layer conforms to the High Profile restricted to IDR pictures. Spatial, quality and temporal scalability are not restricted, but only IDR pictures are allowed in any layer.

Scalable Constrained Baseline Profile

Scalable Constrained High Profile

H.265 High Efficiency Video Coding

H.265 is a high-compression video format developed by ITU-T VCEG as the successor to H.264.
The H.265 video format standard was officially announced by the International Telecommunication Union (ITU) on January 25, 2013, and supports resolutions up to 8192x4320. The Next-generation Video Coding (NGVC) effort aimed to reduce the bit rate by 50% at the same subjective image quality as H.264, with computational complexity up to three times that of H.264. HEVC is designed for the next generation of HDTV, with features such as progressive scan, support for resolutions up to 4320p (8192x4320), and improved dynamic range adjustment and noise suppression.

Figure 2. H.265 encoding flow chart

Technical characteristics

  Two-dimensional non-separable adaptive interpolation filter (AIF)
  Separable AIF
  Directional AIF
  Motion compensation with 1/8-pel motion vectors
  Supermacroblock structure up to 64x64 with additional transforms (H.264 macroblocks are 16x16)
  Adaptive prediction error coding (APEC)
  Adaptive quantization matrix selection (AQMS)
  Competition-based scheme for motion vector selection and coding
  Mode-dependent KLT

Predictive block size

HEVC replaces the macroblocks defined in previous standards with blocks of up to 64x64 pixels that can be further subdivided into blocks of variable size. HEVC divides the picture into coding tree units (CTUs), each consisting of luma and chroma coding tree blocks (CTBs). A CTB can be 64x64, 32x32 or 16x16. Prediction units (PUs) range in size from 64x64 down to 4x4, but inter-predicted PUs are limited to a minimum size of 8x4 or 4x8, and those two sizes cannot use bi-prediction. The transform block size for coding the prediction residual can be 32x32, 16x16, 8x8 or 4x4.

Internal bit depth

Internal bit depth increase (IBDI) allows the encoder to operate at a higher internal bit depth. IBDI supports a bit depth of up to 14 bits.

Parallel processing tools

A picture can be divided into independently decodable rectangular regions and slices, which are the concepts of tiles and slices. Most slices can be decoded separately, but the results ultimately have to be synchronized into a single video stream.
Slices can be coded without prediction across slice boundaries, so that they are independent of each other, although loop filtering may still be required across them.

Entropy coding

HEVC uses context-adaptive binary arithmetic coding (CABAC), similar to the CABAC in H.264. Unlike H.264, HEVC supports only CABAC.

Intra prediction

HEVC intra prediction has 33 directional modes, compared with only 8 in H.264. HEVC also specifies planar and DC intra prediction modes.

Motion compensation

HEVC uses motion compensation with half-sample and quarter-sample precision, with an 8-tap filter for half-sample positions and a 7-tap filter for quarter-sample positions. H.264 uses a 6-tap filter for half-sample positions and linear interpolation for quarter-sample positions. For 4:2:0 video, the chroma components have 1/8 sample precision and use 4-tap filters. Weighted prediction in HEVC can be applied to uni-prediction or bi-prediction.

Motion vector prediction

HEVC defines a signed 16-bit range, [-32768, 32767], for both horizontal and vertical motion vectors. Because the values are expressed in quarter-sample units, this corresponds to -32768 / 4 = -8192 to 32767 / 4 = 8191.75 luma samples. H.264 allows a horizontal range of -2048 to 2047.75 and a vertical range of only -512 to 511.75 luma samples. HEVC has two motion vector (MV) modes: advanced motion vector prediction (AMVP) and merge mode. Merge mode inherits MV values from neighboring blocks and thereby covers the roles of the skip and direct modes.

Inverse transforms

The transform block size for coding the prediction residual in HEVC can be 32x32, 16x16, 8x8 or 4x4. A CTB can be recursively divided into four or more transform units (TUs). TUs use integer transforms based on the discrete cosine transform (DCT), except that the residual of 4x4 intra-predicted luma blocks uses an integer transform derived from the discrete sine transform (DST). This gives a bit rate reduction of about 1% and is restricted to 4x4 luma blocks. Chroma blocks use the same TU sizes as luma blocks.

Loop filter

HEVC has two loop filters: the deblocking filter (DBF) and the sample adaptive offset (SAO) filter.
The DBF is similar to the deblocking filter in H.264 / MPEG-4 AVC, but in HEVC it is applied only on an 8x8 grid (which improves parallel processing performance), whereas in H.264 it is applied to 4x4 blocks. The DBF strength in HEVC ranges from 0 to 2. Horizontal filtering is applied to vertical edges, and vertical filtering to horizontal edges. The SAO filter is applied after the DBF to reconstruct the original image more accurately. For each CTB, the SAO filter can be disabled or applied in edge offset mode or band offset mode.

Deblocking filter

The DBF uses a design similar to that of H.264 / MPEG-4 AVC, with better support for parallel processing. The DBF in HEVC is applied only on an 8x8 sample grid, whereas the DBF in H.264 / MPEG-4 AVC uses a 4x4 sample grid. The 8x8 grid causes no noticeable degradation and significantly improves parallel processing, because the DBF no longer causes cascading interactions with other operations. Another change is that HEVC allows only three DBF strengths, 0 to 2. HEVC also requires the DBF to apply horizontal filtering to all vertical edges of the picture first and only then vertical filtering to the horizontal edges, which allows the DBF to run in multiple parallel threads.

Sample adaptive offset

The SAO filter is applied after the DBF and uses offsets to produce a better reconstruction of the original signal. It has two modes: edge offset mode and band offset mode. In edge offset mode, each sample is compared with two of its neighbors and classified into one of five categories: local minimum, one of two types of edge, local maximum, or none of these. An offset is applied for each of the first four categories. In band offset mode, samples are classified by amplitude into 32 bands, and offsets are transmitted for four consecutive bands. The SAO filter is designed to improve image quality and reduce ringing artifacts.
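The two SAO classification rules described above can be sketched in Python. The sketch assumes that the edge category is derived from the signs of the differences between a sample and its two neighbors along one direction, and that the band index is the five most significant bits of the sample value. The function names, the wrap-around of the four bands at band 31 and the clipping to the sample range are assumptions made for this illustration.

```python
# Illustrative sketch of SAO sample classification (not a full filter).


def sign(x):
    return (x > 0) - (x < 0)


def edge_category(left, current, right):
    """Classify a sample against its two neighbors along one direction.

    1 = local minimum, 2 and 3 = the two edge types, 4 = local maximum,
    0 = none of these (no offset is applied).
    """
    s = sign(current - left) + sign(current - right)
    return {-2: 1, -1: 2, 0: 0, 1: 3, 2: 4}[s]


def band_index(sample, bit_depth=8):
    """Map a sample to one of 32 equal-width bands by its top five bits."""
    return sample >> (bit_depth - 5)


def apply_band_offset(samples, start_band, offsets, bit_depth=8):
    """Add one of four offsets to samples in four consecutive bands."""
    max_value = (1 << bit_depth) - 1
    out = []
    for s in samples:
        k = (band_index(s, bit_depth) - start_band) % 32
        if k < 4:
            s = max(0, min(max_value, s + offsets[k]))
        out.append(s)
    return out


print(edge_category(10, 5, 10))   # sample lower than both neighbors
print(edge_category(5, 7, 10))    # monotonic slope, no category
print(apply_band_offset([0, 64, 72, 255], start_band=8, offsets=[1, 2, 3, 4]))
```

Monotonic slopes and flat areas fall into category 0 and are left untouched, which is why edge offset mode mainly affects the overshoots and undershoots around edges.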
Coding efficiency

Table: Average bit rate reduction at equal PSNR, compared with earlier video coding standards

  Video coding standard    H.264 / MPEG-4 AVC HP   MPEG-4 ASP   H.263 HLP   H.262 / MPEG-2 MP
  HEVC MP                  35.4%                   63.7%        65.1%       70.8%
  H.264 / MPEG-4 AVC HP    -                       44.5%        46.6%       55.4%
  MPEG-4 ASP               -                       -            3.9%        19.7%
  H.263 HLP                -                       -            -           16.2%

Video coding efficiency is typically evaluated objectively using the peak signal-to-noise ratio (PSNR). HEVC benefits from a larger coding tree block (CTB) size. In tests with HM-8.0 on 2560x1600 video, compared with a 64x64 CTB size the bit rate increased by 5.7% when a 32x32 CTB size was used and by 28.2% when a 16x16 CTB size was used. The higher the resolution, the greater the benefit of a larger CTB size, and larger CTBs also reduce decoding time.

The table compares the coding efficiency of the HEVC Main Profile (MP) with the H.264 / MPEG-4 AVC High Profile (HP), the MPEG-4 Advanced Simple Profile (ASP), the H.263 High Latency Profile (HLP) and the H.262 / MPEG-2 Main Profile (MP). The test sequences include five at HD resolutions and four at WVGA (800x480) resolution. Subjective test results showed that HEVC MP reduced the bit rate by 49.3% compared with H.264 / MPEG-4 AVC HP.

VP8

VP8 is an open video compression format that was originally developed by On2 Technologies and later released by Google. Google also released an implementation library for VP8 coding, libvpx, under a BSD-style license with an added patent grant. After some debate, the final VP8 license was confirmed to be an open source license.

VP8 was introduced on September 13, 2008, with the aim of replacing the older VP7 format.

After Google acquired On2 in 2010, there were calls for Google to release the VP8 source code.
On March 12, 2010, the Free Software Foundation published an open letter expressing the hope that Google would gradually replace the Adobe Flash Player and H.264 currently used on YouTube with HTML5 and a freely released VP8.

On May 19, 2010, at its annual Google I / O conference, Google released the VP8 codec software under a BSD license and released the VP8 bitstream format with an irrevocable, royalty-free patent license. VP8 thus became the second On2 Technologies codec to be released as open source. The first was VP3, which was donated to the Xiph.Org Foundation and led to the Theora video format.

Encoding

At present VP8 can only be encoded with libvpx. Ronald Bultje, an FFmpeg developer hired by Google, is developing a VP8 encoder based on the x264 architecture, called xvp8, which will be integrated into x264 after it is published. The WebM hardware development team in Finland has released a register transfer level (RTL) design of a VP8 hardware encoder, which is available free of charge to semiconductor manufacturers.

Decoding

libvpx can decode VP8 video. On July 23, 2010, the FFmpeg developers Jason Garrett-Glaser, Ronald Bultje and David Conrad released a VP8 decoder named ffvp8. Test results show that ffvp8 performs better than Google's own libvpx decoder. The hardware team of the WebM project has also released an RTL design of a hardware decoder, which is likewise free of charge.

WebM

The WebM project was announced together with VP8 on May 19, 2010. Mozilla, Opera, Google and more than 40 other vendors jointly support its development, with the aim of making VP8 the video format of HTML5. WebM is a container format in which the video is coded with VP8 and the audio with Vorbis. Internet Explorer 9 can play WebM video once a decoder is installed, and the Android operating system supports WebM from version 2.3 (Gingerbread). Adobe has also announced that a future Flash Player will play VP8 video.
WebP

On September 30, 2010, Google released WebP, an image file format based on VP8 coding. It is intended to replace JPEG for pictures transmitted on the web. Its container format is the Resource Interchange File Format (RIFF).

Comparison with H.264

H.264 is currently the most widely used video coding format on the Internet, so it is the format most often compared with VP8.

The coding technology of H.264 is covered by patents (licensed through MPEG LA), and license fees must be paid to use it in hardware, whereas VP8 requires none. Even with Google's backing, VP8 can hardly avoid all patents and may go the same way as VC-1. MPEG LA, which manages the H.264 patent pool, claims that 12 companies hold patents covering Google's VP8, and has said that it is preparing to form a VP8 patent pool.

According to a report by the MSU Graphics & Media Lab in May 2011, VP8 needs approximately 213% of the amount of data to reach the same image quality as H.264.

Jason Garrett-Glaser, one of the developers of x264, has made several comments on VP8. In his view, VP8 does not yet have a true bitstream specification and is lacking in some coding techniques:

  Subblock prediction in VP8 matches the 4x4 mode of H.264, but VP8 does not support the 8x8 mode of the H.264 High Profile, which affects detail.
  The lack of B frames in VP8 is another major problem.
  H.264 has extensive hardware support, whereas VP8 still has to rely on software decoding.
  The VP8 specification document is too brief. Most of it consists of C code pasted in directly rather than explanatory text, and many details are unclear.

VP9

VP9 is an open source video codec from Google and the successor to VP8. Early in its development it was called Next Gen Open Video or VP-Next.
VP9 development began in Q3 2011. Its goals are to reduce the bit rate by 50% compared with VP8 at the same quality, and to achieve better coding efficiency than High Efficiency Video Coding.

At the end of 2012 the VP9 decoder was added to the Chrome browser, and it shipped in an official release of Chrome in February 2013.

VP9 extends the superblock size to 32x32 (the finalized format goes up to 64x64) and partitions superblocks using a quadtree structure.

REFERENCE:

http://houh-1984.blog.163.com
http://www.embedded.com/electronics-blogs/cole-bin/4407676/who-l-win-the-consumer-video-codec-battles-?cid=newsletter+- +Embedded.com +Tech +FOCUS
http://en.wikipedia.org/wiki/high_efficiency_video_coding
http://en.wikipedia.org/wiki/h.264/mpeg-4_avc
http://en.wikipedia.org/wiki/h.264/mpeg-4_avc_products_and_implementations
http://en.wikipedia.org/wiki/scalable_video_coding
http://en.wikipedia.org/wiki/vp8
http://en.wikipedia.org/wiki/vp9

This article describes the H.264 AVC and H.265 HEVC codecs and Google's VP8 and VP9 codecs, which are currently used in the consumer electronics market, and analyzes their technical features, coding efficiency and application areas. "