Introduction
Internet-based digital video is a promising industry, and the large-scale deployment of 3G is helping to make mobile video communication a reality. The digital video industry here refers to the cultural, creative and communication industry built around digital video content, together with the multidisciplinary high-tech support and service industries it relies on. However, digitized video contains a massive amount of data, which makes it difficult to store and transmit. Digital video compression coding is the key technology for solving this problem. The standards expert groups adopted compression coding that exploits the spatial correlation between adjacent pixels and adjacent lines within a frame and the temporal correlation between adjacent frames of a moving image, and that discards redundant components of little importance to human vision and hearing. This reduces the amount of data to be stored, transmitted and processed, improves the utilization of spectrum resources, and makes digitization practical. With its good network adaptability, high compression efficiency and flexible syntax configuration, H.264 is better suited than previous video coding standards to the range of application environments found in video processing, and it further improves both compression efficiency and playback quality. At the same subjective visual quality, the coding efficiency of H.264 is about 50% higher than that of H.263.
Using a high-performance digital signal processor (DSP) to build a real-time H.264 encoder is a fast and effective approach. It helps the rapid promotion and application of the H.264 video standard and reflects a current research direction in video image compression. Digital signal processing is a discipline that draws on many fields and is applied in many more. It emerged in the 1960s alongside the rapid development of computer and information technology and has grown quickly since. Digital signal processing represents real-world signals as digital sequences and uses mathematical techniques to transform them or extract information from them. In the past two decades it has been widely used in communication and other fields. A digital signal processor is a specialized microprocessor designed to process large amounts of information in digital form. In a typical system, analog signals are received and converted into digital signals of 0s and 1s, the DSP modifies, filters and enhances the digital signal, and other chips in the system convert the digital data back to analog form or to the format the application needs.
1 Key technologies of H.264 coding
1.1 Motion estimation and compensation based on flexible macroblock (MB) partitioning, and transform coding with improved compression
According to the coding characteristics of each macroblock, H.264 combines a DC transform for luminance blocks, a DC transform for chroma blocks and an ordinary residual transform. The residual transform is an integer transform based on 4 × 4 blocks. In motion estimation, the block size can be chosen flexibly, whereas other standards process fixed blocks of 16 × 16 or 8 × 8 pixels. H.264 uses variable block sizes to suit different application environments and requirements. A macroblock can be partitioned in four modes: 16 × 16, 16 × 8, 8 × 16 and 8 × 8. In 8 × 8 mode, each block can be divided further using three sub-macroblock modes: 8 × 4, 4 × 8 and 4 × 4, as shown in Figure 1. Block sizes are chosen as required. Smaller blocks make the partitioning of moving objects more accurate and reduce the matching error at the edges of moving objects, so where more motion detail is needed, introducing smaller motion compensation blocks improves prediction quality and subjective visual quality. The integer transform also reduces the amount of calculation in the transform process. Experiments show that using the seven block sizes and shapes improves the compression rate by more than 15% compared with coding with 16 × 16 blocks only.
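The partition modes described above can be made concrete with a small sketch. It only enumerates block geometry and does not perform motion estimation; the function and constant names (`tile`, `partition_macroblock`, `MB_MODES`, `SUB_MB_MODES`) are hypothetical and chosen for illustration.

```python
# Macroblock and sub-macroblock partition sizes (width, height) named in the text.
MB_SIZE = 16
MB_MODES = [(16, 16), (16, 8), (8, 16), (8, 8)]
SUB_MB_MODES = [(8, 8), (8, 4), (4, 8), (4, 4)]


def tile(region_w, region_h, part_w, part_h, x0=0, y0=0):
    """Return (x, y, w, h) rectangles that tile a region with equal partitions."""
    return [
        (x0 + x, y0 + y, part_w, part_h)
        for y in range(0, region_h, part_h)
        for x in range(0, region_w, part_w)
    ]


def partition_macroblock(mb_mode, sub_modes=None):
    """Split one 16x16 macroblock into motion-compensation blocks.

    mb_mode   -- one of MB_MODES
    sub_modes -- for the 8x8 mode only: four entries from SUB_MB_MODES,
                 one per 8x8 quadrant (defaults to no further split)
    """
    blocks = tile(MB_SIZE, MB_SIZE, mb_mode[0], mb_mode[1])
    if mb_mode != (8, 8):
        return blocks
    sub_modes = sub_modes or [(8, 8)] * 4
    result = []
    for (x, y, w, h), sub in zip(blocks, sub_modes):
        result.extend(tile(w, h, sub[0], sub[1], x, y))
    return result


def distinct_block_sizes():
    """The distinct block sizes available for motion compensation."""
    return sorted(set(MB_MODES) | set(SUB_MB_MODES), reverse=True)
```

For example, `partition_macroblock((8, 8), [(8, 8), (8, 4), (4, 8), (4, 4)])` splits the four quadrants differently, which is how an encoder can spend small blocks only where the motion is detailed.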
1.2 Motion estimation with 1/4-pixel or 1/8-pixel accuracy
Motion estimation and compensation is the most critical part of video compression. It affects coding speed, quality and bit rate, and its complexity is the highest in the whole coding system.
In H.264, the predicted value at a 1/2-pixel position is obtained by interpolation with a 6-tap FIR filter. A finite impulse response (FIR) filter is one of the most basic elements of a digital signal processing system. It can realize an arbitrary amplitude-frequency response while keeping a strictly linear phase response, and because its unit impulse response is finite, the filter is always stable. FIR filters are therefore widely used in communication, image processing, pattern recognition and other fields. Once the 1/2-pixel values are available, the value at a 1/4-pixel position is obtained by averaging the values at the neighbouring integer-pixel and 1/2-pixel positions. For high bit rates, motion estimation with 1/8-pixel accuracy is provided. Higher-precision motion estimation further reduces the inter prediction error, reduces the number of non-zero coefficients after transform and quantization, and improves coding efficiency. Using 1/4-pixel accuracy can improve coding efficiency by 20% compared with integer-pixel accuracy.
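The interpolation steps above can be sketched in one dimension. The tap values (1, -5, 20, 20, -5, 1) with division by 32 are the H.264 luma half-pixel filter; the function names and the restriction to a single row of 8-bit samples are simplifications made for this example.

```python
# 6-tap FIR filter used for luma half-pixel interpolation (taps sum to 32).
TAPS = (1, -5, 20, 20, -5, 1)


def clip_pixel(value):
    """Clamp a value to the 8-bit pixel range."""
    return max(0, min(255, value))


def half_pel(samples):
    """Interpolate the half-pixel value between samples[2] and samples[3].

    samples -- six consecutive integer-position pixels along one row or column
    """
    if len(samples) != 6:
        raise ValueError("expected six neighbouring samples")
    acc = sum(tap * s for tap, s in zip(TAPS, samples))
    return clip_pixel((acc + 16) >> 5)  # add 16 to round, then divide by 32


def quarter_pel(a, b):
    """Quarter-pixel value as the rounded mean of two neighbouring values,
    for example an integer-position pixel and a half-pixel value."""
    return (a + b + 1) >> 1
```

On a sharp edge such as `[0, 0, 0, 255, 255, 255]` the half-pixel value lands midway between the two levels, while the negative taps can overshoot on narrow peaks, which is why the result is clipped.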
1.3 Multi reference frame prediction
A reference frame is the basis of inter prediction coding, that is, of motion compensation. According to its position relative to the frame to be predicted, it is either a forward reference frame or a backward reference frame.
In earlier codecs, inter prediction of a P frame could refer only to the preceding frame, that is, the previous I or P picture. Prediction of a B picture could refer only to the preceding and following I or P pictures. H.264 removes these restrictions and allows the reference for predicting a macroblock's motion to be chosen from several frames preceding the current frame. When the multi reference frame mode is selected, the encoder picks the reference frame that gives the best prediction from among the candidates. The reference picture can even be a picture coded with bidirectional prediction, which greatly reduces the prediction error. In addition, multi reference frame prediction gives better results for periodic motion and background switching.
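A minimal sketch of reference selection follows. It assumes the candidate blocks have already been extracted from each reference frame and ranks them by the sum of absolute differences (SAD). A real encoder would run a motion search in every reference frame and usually weigh distortion against bit cost, so `sad` and `best_reference` are illustrative names, not part of any codec API.

```python
def sad(block_a, block_b):
    """Sum of absolute differences between two equally sized pixel blocks."""
    return sum(
        abs(a - b)
        for row_a, row_b in zip(block_a, block_b)
        for a, b in zip(row_a, row_b)
    )


def best_reference(current_block, candidate_blocks):
    """Pick the candidate block with the lowest SAD.

    candidate_blocks -- one block per reference frame, index 0 being the
                        most recent frame (an assumption of this sketch)
    Returns (reference_index, cost); ties go to the lowest index.
    """
    if not candidate_blocks:
        raise ValueError("at least one reference frame is required")
    costs = [sad(current_block, cand) for cand in candidate_blocks]
    best = min(range(len(costs)), key=costs.__getitem__)
    return best, costs[best]
```

With periodic motion, the block from two or more frames back can match the current block exactly even when the immediately preceding frame does not, which is the case where a single-reference encoder loses efficiency.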
1.4 Adaptive filter for eliminating the block effect
Transform coding based on block processing ignores the continuity of object edges and is prone to blocking artifacts (the block effect) at low bit rates. To eliminate the block effect introduced by prediction and transformation, H.264 applies an adaptive deblocking filter that smooths block edges and effectively improves the subjective quality of the image. Unlike in previous standards, the H.264 deblocking filter sits inside the motion estimation and compensation loop. The deblocked picture is used as a reference for predicting other pictures, so the filtered macroblocks feed motion estimation, produce a smaller frame difference to encode, and further improve prediction accuracy.
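The adaptive behaviour can be illustrated with a simplified one-dimensional sketch modelled on the normal-strength luma edge filter. It adjusts only the two pixels nearest the edge. In the standard, the thresholds `alpha`, `beta` and `tc` are taken from tables indexed by the quantization parameter; here they are plain arguments and the values used in the usage note are hypothetical.

```python
def clip3(low, high, value):
    """Clamp value to the range [low, high]."""
    return max(low, min(high, value))


def filter_edge(p1, p0, q0, q1, alpha, beta, tc):
    """Smooth one line of pixels across a block edge lying between p0 and q0.

    p1, p0 -- the two pixels on one side of the edge (p0 nearest the edge)
    q0, q1 -- the two pixels on the other side (q0 nearest the edge)
    alpha, beta, tc -- thresholds; hypothetical inputs in this sketch
    Returns the possibly modified (p0, q0).
    """
    # A large step is treated as a real object edge and left untouched.
    is_artifact = (
        abs(p0 - q0) < alpha
        and abs(p1 - p0) < beta
        and abs(q1 - q0) < beta
    )
    if not is_artifact:
        return p0, q0
    delta = clip3(-tc, tc, (4 * (q0 - p0) + (p1 - q1) + 4) >> 3)
    return clip3(0, 255, p0 + delta), clip3(0, 255, q0 - delta)
```

With `alpha=20, beta=6, tc=4`, a small step such as `filter_edge(100, 100, 108, 108, ...)` is softened, while a large step such as `filter_edge(50, 50, 200, 200, ...)` passes through unchanged, which is what keeps genuine edges sharp.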
1.5 Enhanced entropy coding
Entropy coding in previous standards used variable-length Huffman coding with a single fixed code table, which cannot adapt to varied video content and therefore limits coding efficiency. H.264 adapts to the video content and represents frequently occurring symbols with shorter codewords, which further removes redundancy from the code stream. It provides two kinds of entropy coding: context-adaptive binary arithmetic coding (CABAC) and context-adaptive variable length coding (CAVLC). CABAC has higher coding efficiency but also higher complexity. At the same image quality, using CABAC to encode TV signals can reduce the bit rate by about 10% to 15%, while CAVLC has stronger error resilience.
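The principle of giving shorter codewords to more frequent symbols can be shown with the unsigned Exp-Golomb code, which H.264 uses for many syntax elements. CAVLC and CABAC themselves are considerably more involved and are not reproduced here. The function names and the use of bit strings instead of a packed bitstream are choices made for this sketch.

```python
def exp_golomb_ue(value):
    """Unsigned Exp-Golomb codeword for a non-negative integer, as a bit string."""
    if value < 0:
        raise ValueError("value must be non-negative")
    info = bin(value + 1)[2:]              # binary form of value + 1
    return "0" * (len(info) - 1) + info    # prefix of leading zeros, then info


def decode_ue(bits, pos=0):
    """Decode one codeword starting at bits[pos].

    Returns (value, next_position).
    """
    zeros = 0
    while bits[pos + zeros] == "0":
        zeros += 1
    end = pos + 2 * zeros + 1
    return int(bits[pos + zeros:end], 2) - 1, end
```

Small values, which are assumed to occur most often, cost 1 or 3 bits, and the code length grows only logarithmically with the value. Because each codeword is self-delimiting, codewords can be concatenated and decoded one after another without separators.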
2 DSP platform implementation of video codec based on H.264
Digital image processing involves a large amount of digital signal processing work, especially for a new-generation video compression standard such as H.264. For the baseline profile, its decoding complexity is about twice that of H.263 under the same conditions, and its encoding complexity is about three times that of H.263. Meeting this demand for computing capacity depends largely on high-speed DSP technology, and DSP processors built on modern semiconductor manufacturing processes can also offer lower power consumption.
The DM64x series chips produced by TI have a very high clock frequency, strong parallel processing capability and dedicated signal processing functions, which makes them an ideal platform for H.264 encoding and decoding.
The DM642 produced by TI is a DSP designed for multimedia applications. It has a clock frequency of 600 MHz, 8 parallel functional units and a processing capacity of 4800 MIPS, and it adds many peripherals and interfaces to the C64x core. The DM642 is therefore a powerful multimedia processor and a good platform for multimedia communication systems.
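The figures above can be turned into a rough per-macroblock cycle budget. The 600 MHz clock and 8 units come from the text. The QCIF resolution (176 × 144) and the PAL frame rate of 25 frames per second are assumptions chosen for illustration, and the estimate ignores memory stalls and other overheads.

```python
def peak_mips(clock_mhz, parallel_units):
    """Peak instruction rate if every unit issues one instruction per cycle."""
    return clock_mhz * parallel_units


def macroblocks_per_frame(width, height, mb_size=16):
    """Number of 16x16 macroblocks in a frame with dimensions divisible by 16."""
    return (width // mb_size) * (height // mb_size)


def cycles_per_macroblock(clock_hz, fps, width, height):
    """Clock cycles available per macroblock for real-time operation."""
    return clock_hz // (fps * macroblocks_per_frame(width, height))


if __name__ == "__main__":
    print(peak_mips(600, 8))                                 # peak MIPS
    print(cycles_per_macroblock(600_000_000, 25, 176, 144))  # QCIF budget
```

Under these assumptions the encoder has roughly 240,000 cycles for each macroblock, which must cover motion estimation, transform, quantization and entropy coding.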
The system captures an analog video image (PAL format), compresses it, and sends the compressed data to the receiving end over a spread-spectrum link. At the receiving end, the DSP decompresses the received code stream and then handles the display and storage of the image. The overall design must therefore include video input/output, network and other interfaces. The design is shown in Figure 2.
At the transmitting end, the analog video signal is first converted into a digital video signal by the video A/D chip and then fed to video port 2 of the DM642. The DM642 captures the image and stores the image data in SDRAM. At the same time, it compresses the video in real time and sends the compressed data through the McBSP to the channel coding stage, which completes the work of the transmitting end. At the receiving end, the DM642 receives the compressed image data from the channel decoding stage, decompresses it in real time and stores the decompressed data in SDRAM. The decompressed image data is then sent to video port 0, which passes it to the video D/A chip for real-time display. The 10/100 Mb/s Ethernet interface and the USB controller shown in Figure 2 extend the audio/video interfaces, mainly so that the receiver can transfer the digital video signal directly to a computer or terminal. The power supply and reset circuit provides power and reset for the board.
3 DSP optimization of video codec based on H.264
The H.264 encoder is ported to the DM642 image processing platform. Straight after porting, the coding speed of the whole system is far from satisfactory and cannot meet the requirements of real-time applications, because both the code structure and the specific core algorithms need to be improved. The system therefore has to be optimized in every respect to reduce the coding time. First, redundant code in the encoder is removed. The optimization work is then divided into three steps. The first is to implement and optimize the H.264 algorithm on a PC. The second is to port the PC code to the DSP, which yields a working H.264 encoding and decoding algorithm on the DSP but with very low efficiency, because all the code is written in C and does not make full use of the DSP's features. The third is to optimize the H.264 code for the DSP, taking its characteristics into account, so that the H.264 video decoder can process video images in real time. The optimization of the DSP code is in turn divided into three levels: project-level optimization, C-program-level optimization and assembly-level optimization.
4 Conclusion
In the above environment, the decoder reaches a decoding speed of 45 to 60 frames per second for QCIF test sequences and achieves real-time decoding. The test results show that the subjective quality of the image is good, with no obvious block effect, and that the bit rate is relatively low. The real-time performance of image coding also depends on the image content and the intensity of motion. The H.264 video codec implemented on the DM642 board is powerful and flexible and has wide application prospects. The key to Radvision's strategy is to support H.264-SVC in the Scopia architecture, so that Radvision's SVC-capable Scopia Desktop system can interoperate with other standard video codecs such as H.263 and H.264. Different devices can join the same conference, each obtaining the best video quality its performance allows, and devices that do not support H.264-SVC can still connect through H.263 or H.264. In the near future, videophone, video conferencing, cable TV, wireless streaming media and other products based on the H.264 algorithm and DSP processors are expected to enter ordinary households, and video codecs on embedded terminals are expected to become mainstream.