3. Capture
Capture consists of two parts: video capture and audio capture. Video is captured by the camera, which involves operating the camera and setting its parameters. Because cameras differ between phone manufacturers, there are quite a few pitfalls here; these will be covered in the article dedicated to the camera. Audio is captured through the microphone. Microphones on different phones support different sampling rates, and in some cases the captured audio also needs echo cancellation.
Key points of video capture (a short sketch of these checks follows the list):
Check whether the camera can be used;
The image delivered by the camera sensor is landscape-oriented, so it must be rotated before it is displayed;
The camera offers a set of capture sizes to choose from; when the captured size does not match the phone's screen size, extra handling is required;
The Android camera goes through a series of states, and each camera operation must be performed in the correct state;
Many camera parameters have compatibility issues across devices, and these need to be handled carefully.
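As a rough illustration of these checks, here is a minimal sketch using the legacy android.hardware.Camera API (not our exact implementation; the CAMERA permission is assumed to be granted, and the 1280-wide target size is just an example):

import android.hardware.Camera;
import java.util.List;

public class CameraCapture {
    private Camera mCamera;

    // Returns true if a camera could be opened and configured.
    public boolean open() {
        if (Camera.getNumberOfCameras() == 0) return false;    // no camera available
        try {
            mCamera = Camera.open(0);
        } catch (RuntimeException e) {
            return false;                                       // camera busy or inaccessible
        }
        Camera.Parameters params = mCamera.getParameters();
        // Pick a supported preview size close to the target; supported sizes differ per device.
        List<Camera.Size> sizes = params.getSupportedPreviewSizes();
        Camera.Size best = sizes.get(0);
        for (Camera.Size s : sizes) {
            if (Math.abs(s.width - 1280) < Math.abs(best.width - 1280)) best = s;
        }
        params.setPreviewSize(best.width, best.height);
        mCamera.setParameters(params);
        // The sensor image is landscape; rotate the preview 90° for a portrait UI.
        mCamera.setDisplayOrientation(90);
        return true;
    }
}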
Key points of audio capture (a short sketch follows the list):
Check whether the microphone can be used;
Check whether the phone supports the chosen audio sampling rate;
In some cases, echo cancellation needs to be applied to the audio;
Set a correct buffer size during audio capture.
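A minimal sketch of the sampling-rate check and buffer sizing with AudioRecord (the RECORD_AUDIO permission is assumed, and 16-bit mono is just an example configuration):

import android.media.AudioFormat;
import android.media.AudioRecord;
import android.media.MediaRecorder;

public class AudioCapture {
    public AudioRecord create(int sampleRate) {
        // getMinBufferSize() returns ERROR_BAD_VALUE if this rate/format combination is unsupported.
        int minSize = AudioRecord.getMinBufferSize(sampleRate,
                AudioFormat.CHANNEL_IN_MONO, AudioFormat.ENCODING_PCM_16BIT);
        if (minSize == AudioRecord.ERROR_BAD_VALUE || minSize == AudioRecord.ERROR) {
            return null;                          // this sampling rate is not supported
        }
        // Use a buffer somewhat larger than the minimum to avoid overruns.
        AudioRecord record = new AudioRecord(MediaRecorder.AudioSource.MIC,
                sampleRate, AudioFormat.CHANNEL_IN_MONO,
                AudioFormat.ENCODING_PCM_16BIT, minSize * 2);
        // A state other than STATE_INITIALIZED means the microphone cannot be used.
        // If echo cancellation is needed and AcousticEchoCanceler.isAvailable() is true,
        // it can be attached via AcousticEchoCanceler.create(record.getAudioSessionId()).
        return record.getState() == AudioRecord.STATE_INITIALIZED ? record : null;
    }
}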
Note: A later article will cover capture in detail.
4. Processing
Video processing
Beauty filters are now almost standard in mobile live-streaming apps: after beautification the host looks better and attracts more fans. Some Android live-streaming apps can also recognize the host's face and add fun animated effects, and sometimes we also need to add a watermark to the video.
In fact, both beautification and special effects are processed through OpenGL. Android provides GLSurfaceView, which is similar to SurfaceView but renders through a Renderer. A texture can be generated with OpenGL, a SurfaceTexture can be created from that texture id and handed to the Camera, and the camera preview is thereby connected to OpenGL through the texture, so that a whole series of operations can be performed in OpenGL, as sketched below.
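A minimal sketch of that camera-to-OpenGL connection (assumed to run on the GL thread of a GLSurfaceView.Renderer; error handling is omitted):

import android.graphics.SurfaceTexture;
import android.hardware.Camera;
import android.opengl.GLES11Ext;
import android.opengl.GLES20;
import java.io.IOException;

public class CameraGlBridge {
    public SurfaceTexture attach(Camera camera) throws IOException {
        // 1. Generate an OpenGL texture of the external OES type used for camera frames.
        int[] tex = new int[1];
        GLES20.glGenTextures(1, tex, 0);
        GLES20.glBindTexture(GLES11Ext.GL_TEXTURE_EXTERNAL_OES, tex[0]);
        GLES20.glTexParameterf(GLES11Ext.GL_TEXTURE_EXTERNAL_OES,
                GLES20.GL_TEXTURE_MIN_FILTER, GLES20.GL_LINEAR);
        GLES20.glTexParameterf(GLES11Ext.GL_TEXTURE_EXTERNAL_OES,
                GLES20.GL_TEXTURE_MAG_FILTER, GLES20.GL_LINEAR);
        // 2. Wrap the texture id in a SurfaceTexture and hand it to the camera.
        SurfaceTexture surfaceTexture = new SurfaceTexture(tex[0]);
        camera.setPreviewTexture(surfaceTexture);
        camera.startPreview();
        // 3. In onDrawFrame(), call surfaceTexture.updateTexImage() and draw with the texture.
        return surfaceTexture;
    }
}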
The whole beautification process boils down to taking the texture the Camera previews into, generating a new texture from it with OpenGL's FBO (framebuffer object) technique, and then drawing that new texture in the Renderer's onDrawFrame(). Adding a watermark means first converting an image into a texture and then drawing it with OpenGL. Adding animated sticker effects is more complicated: the current preview frame must first be analyzed to locate the relevant parts of the face, and the corresponding images are then drawn over each part. Implementing the whole pipeline is somewhat difficult.
The figure below is a flowchart of the entire beauty process:
(Figure: beauty process flowchart)
(Figure: examples of the beauty filter, animated effects, and watermark)
Note: A later article will cover OpenGL and the implementation of this whole process.
Audio processing
In some cases the host needs to mix in extra sounds to liven up the broadcast, such as applause. One way to handle this is simply to play the extra sound so the microphone picks it up and records it along with everything else, but that approach fails when the host wears headphones or when echo cancellation has to be applied. Since we have not added this feature to our project, we have no experience to share for now; we may add it later and share it then.
5. Encoding
Through the camera and microphone we can collect the corresponding video and audio, but this is raw data in a fixed format: the camera delivers it frame by frame, and the microphone delivers PCM audio. Sending this data directly would waste a great deal of bandwidth, so video and audio are usually encoded before being sent.
Video encoding
1. Predictive coding
As we all know, an image is made up of pixels. Extensive statistics show that there is strong correlation between pixels within the same image: the shorter the distance between two pixels, the stronger the correlation, that is, the closer their values tend to be. This correlation between pixels can therefore be exploited for compression, a method called intra-frame predictive coding. Moreover, the correlation between adjacent frames is generally even stronger than the correlation between pixels within a frame, and the achievable compression ratio is higher. In short, by exploiting intra-frame correlation between pixels and inter-frame correlation between frames, that is, by finding a reference pixel or reference frame to use as the predicted value, video compression coding can be achieved.
2. Transform coding
Statistics show that most of the energy in a video signal is concentrated in the DC and low-frequency components, which correspond to the flat parts of the image, while only a small amount lies in the high-frequency components, which correspond to image detail. Therefore another approach can be used for video coding: after a certain mathematical transform, the image is represented in the transform domain (as shown in the figure), where u and v are the spatial frequency coordinates.
(Figure: transform coding)
3. Waveform-based coding
Waveform-based coding uses a block-based hybrid method that combines predictive coding and transform coding. To reduce coding complexity and make the video coding operations easier to perform, the hybrid method first divides an image into blocks of fixed size, such as 8×8 (8 rows of 8 pixels per block) or 16×16 (16 rows of 16 pixels per block), and then compresses and encodes each block.
Since ITU-T released the first digital video coding standard, H.261, in 1989, it has gone on to publish video coding standards such as H.263 and multimedia terminal standards such as H.320 and H.323. The Moving Picture Experts Group (MPEG) under ISO has defined MPEG-1, MPEG-2, MPEG-4, and other international compression standards for entertainment and digital TV.
In March 2003, ITU-T published the H.264 video coding standard. It not only improves compression significantly over previous standards but also has good network friendliness, particularly for video transmission over networks that are error-prone, congestion-prone, and unable to guarantee QoS, such as the IP Internet and wireless mobile networks. All of these standards use block-based hybrid coding and are therefore waveform-based.
4. Content-based coding
There is also content-based coding, in which a video frame is first segmented into regions corresponding to different objects, which are then encoded separately: the shape, motion, and texture of each object are encoded. In the simplest case, a two-dimensional contour describes the object's shape, a motion vector describes its motion, and a color waveform describes its texture.
When the types of objects in a video sequence are known, knowledge-based or model-based coding can be used. For human faces, for example, predefined wire-frame models exist for encoding facial features, and the coding efficiency is very high: only a few bits are needed to describe the features. Facial expressions (such as angry or happy) can be encoded semantically as possible behaviors; since the number of possible behaviors of an object is very small, very high coding efficiency can be achieved.
MPEG-4 uses both block-based hybrid coding and content-based coding.
5. Software and hardware encoding
There are two ways to implement video encoding on Android: software encoding and hardware encoding. Software encoding relies on the CPU and uses its computing power to encode; for example, we can build the x264 encoding library, write the corresponding JNI interface, and pass in the image data, and x264 converts the raw images into H.264 video.
Hardware encoding uses Android's own MediaCodec. MediaCodec accepts its input either as YUV image data or through a Surface. Surface is generally recommended because it is more efficient: it uses native video buffers directly, without mapping or copying them into ByteBuffers. When using a Surface you usually cannot access the raw video data directly, but the ImageReader class can be used to access the unencoded (raw) video frames; this may still be more efficient than using ByteBuffers, since some native buffers can be mapped into direct ByteBuffers. In ByteBuffer mode, the raw video frames can be accessed through the Image class and the getInput/OutputImage(int) methods.
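A minimal sketch of configuring a hardware H.264 encoder with MediaCodec in Surface input mode (the resolution, bitrate, and frame rate values are only examples):

import android.media.MediaCodec;
import android.media.MediaCodecInfo;
import android.media.MediaFormat;
import android.view.Surface;
import java.io.IOException;

public class HardwareVideoEncoder {
    private MediaCodec mCodec;
    private Surface mInputSurface;

    public void start() throws IOException {
        MediaFormat format = MediaFormat.createVideoFormat("video/avc", 1280, 720); // H.264
        format.setInteger(MediaFormat.KEY_COLOR_FORMAT,
                MediaCodecInfo.CodecCapabilities.COLOR_FormatSurface); // Surface input mode
        format.setInteger(MediaFormat.KEY_BIT_RATE, 2000 * 1024);      // ~2 Mbps
        format.setInteger(MediaFormat.KEY_FRAME_RATE, 25);
        format.setInteger(MediaFormat.KEY_I_FRAME_INTERVAL, 2);        // a key frame every 2 s
        mCodec = MediaCodec.createEncoderByType("video/avc");
        mCodec.configure(format, null, null, MediaCodec.CONFIGURE_FLAG_ENCODE);
        mInputSurface = mCodec.createInputSurface(); // render the OpenGL output into this Surface
        mCodec.start();
        // Encoded H.264 buffers are then drained with dequeueOutputBuffer().
    }
}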
Note: A later article will describe video encoding in detail.
Audio encoding
On Android, AudioRecord can be used to record sound, and what it records is PCM audio. To represent sound in a computer it must be digitized, and the most common way to do so is Pulse Code Modulation (PCM). Sound passes through the microphone and becomes a signal of varying voltage, and converting such a signal into PCM is described by three parameters: the number of channels, the sample size (bit depth), and the sampling frequency.
1. Sampling frequency
The sampling frequency (sampling rate) is the number of sound samples taken per second. The higher the sampling frequency, the better the sound quality and the more faithful the reproduction, but also the more resources it consumes. Because the resolution of the human ear is limited, excessively high rates cannot be distinguished. 16-bit sound cards offer levels such as 22 kHz and 44 kHz; 22 kHz is roughly the quality of FM radio and 44 kHz roughly that of a CD. Common sampling frequencies today do not exceed 48 kHz.
2. Number of sampling bits
The sample size (bit depth) is the quantized amplitude of each sample. It measures the dynamic range of the sound and can be thought of as the resolution of the sound card: the larger the value, the finer the resolution and the greater the dynamic range.
In computers, the sample size is usually 8 or 16 bits. Note that 8 bits does not mean dividing the amplitude axis into 8 parts but into 2 to the 8th power, i.e. 256 parts; likewise, 16 bits divides it into 2 to the 16th power, i.e. 65,536 parts.
3. Number of channels
This one is easy to understand: there is mono and stereo. Mono is reproduced by a single speaker (sometimes the same channel is duplicated to two speakers), while stereo PCM drives two speakers (usually with distinct left and right channels), which gives a much better sense of space.
So now we can write down the formula for the size of a PCM file:
Storage size = (sampling frequency × sample size in bits × number of channels × duration in seconds) ÷ 8 (unit: bytes)
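For example, one second of 44.1 kHz, 16-bit, stereo PCM occupies (44100 × 16 × 2 × 1) ÷ 8 = 176,400 bytes, roughly 172 KB, so a one-minute clip is already about 10 MB.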
If audio were transmitted entirely as PCM, the bandwidth required would be quite large, so the audio needs to be encoded before transmission.
There are already widely used sound formats such as WAV, MIDI, MP3, WMA, AAC, and Ogg. Compared with PCM, most of these formats compress the audio data and thus reduce the transmission bandwidth.
Audio encoding can likewise be done in software or hardware. For software encoding, download the corresponding encoding library, write the corresponding JNI layer, and pass in the data to be encoded. Hardware encoding again uses Android's own MediaCodec, for example as sketched below.
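As an illustration of the hardware path, here is a minimal sketch of an AAC encoder configured through MediaCodec (44.1 kHz stereo at 128 kbps are example values); the PCM from AudioRecord would be queued into its input buffers:

import android.media.MediaCodec;
import android.media.MediaCodecInfo;
import android.media.MediaFormat;
import java.io.IOException;

public class HardwareAudioEncoder {
    private MediaCodec mCodec;

    public void start() throws IOException {
        MediaFormat format = MediaFormat.createAudioFormat("audio/mp4a-latm", 44100, 2); // AAC
        format.setInteger(MediaFormat.KEY_AAC_PROFILE,
                MediaCodecInfo.CodecProfileLevel.AACObjectLC);   // AAC-LC, as used in FLV
        format.setInteger(MediaFormat.KEY_BIT_RATE, 128 * 1024); // 128 kbps
        mCodec = MediaCodec.createEncoderByType("audio/mp4a-latm");
        mCodec.configure(format, null, null, MediaCodec.CONFIGURE_FLAG_ENCODE);
        mCodec.start();
        // PCM from AudioRecord is queued into the input buffers; AAC frames come out.
    }
}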
Note: A later article will describe audio encoding in detail.
6. Packaging
During transmission, the video and audio need to be wrapped in a defined container format so that the receiving end can parse them correctly.
1. HTTP-FLV
In the Web 2.0 era, the most popular websites were video sites: YouTube abroad, Youku and Tudou in China. The content these sites offered varied, but all of them, without exception, used Flash as the playback carrier. The technology underpinning them is Flash Video (FLV). FLV is a streaming video format that relies on the Flash Player plug-in that was ubiquitous on web pages, embedding video into Flash content. In other words, as long as visitors could view Flash animations they could watch FLV videos without installing any additional plug-in, which made FLV very convenient for video distribution.
HTTP-FLV encapsulates the audio and video data in FLV and then delivers it to the client over HTTP. On the publishing side, we only need to send FLV-formatted video and audio to the server.
In FLV, the video is generally in H.264 format and the audio is generally AAC-LC.
An FLV stream starts with the FLV header, followed by the metadata (Metadata) carrying the video and audio parameters, then the video and audio parameter information (the sequence headers), and then the video and audio data itself.
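To make that layout concrete, here is a minimal sketch that builds the 9-byte FLV file header plus the first PreviousTagSize field; every tag that follows (script data, audio, video) is likewise followed by its own 4-byte PreviousTagSize:

import java.io.ByteArrayOutputStream;
import java.io.IOException;

public class FlvHeader {
    // Writes the FLV header: signature "FLV", version 1, flags (audio + video), offset 9,
    // followed by PreviousTagSize0 = 0.
    public static byte[] build(boolean hasAudio, boolean hasVideo) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        out.write(new byte[]{'F', 'L', 'V', 0x01});
        int flags = (hasAudio ? 0x04 : 0) | (hasVideo ? 0x01 : 0);
        out.write(flags);
        out.write(new byte[]{0x00, 0x00, 0x00, 0x09});  // header length = 9 bytes
        out.write(new byte[]{0x00, 0x00, 0x00, 0x00});  // PreviousTagSize0
        return out.toByteArray();
    }
}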
Note: A later article will describe FLV in detail.
2. RTMP
RTMP stands for Real-Time Messaging Protocol. The protocol runs over TCP and is a protocol family that includes the basic RTMP protocol plus variants such as RTMPT, RTMPS, and RTMPE. RTMP was designed for real-time data communication and is mainly used for audio, video, and data exchange between the Flash/AIR platform and streaming or interactive servers that support RTMP.
RTMP is a real-time transport protocol from Adobe, mainly used to carry FLV-format audio and video streams in real time. After obtaining the encoded video and audio data, it is first packaged as FLV and then wrapped into RTMP messages for transmission.
To transmit over RTMP, you first connect to the server, then create a stream, then publish the stream, and only then send the video and audio data. The entire exchange is defined in terms of messages: RTMP defines many kinds of messages, and to transmit them efficiently each message is split into chunks, which makes the protocol fairly complex.
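To make the chunking idea concrete, here is a deliberately simplified sketch (nowhere near a complete RTMP implementation) that splits one message payload into chunks of the default 128-byte chunk size: the first chunk carries a full type-0 message header, and the continuation chunks carry only a type-3 basic header:

import java.io.ByteArrayOutputStream;

public class RtmpChunker {
    private static final int CHUNK_SIZE = 128;  // RTMP default chunk size

    // Splits one message payload into chunks on chunk stream id `csid` (2..63).
    public static byte[] chunk(int csid, int timestamp, int msgTypeId,
                               int msgStreamId, byte[] payload) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (int offset = 0; offset < payload.length; offset += CHUNK_SIZE) {
            if (offset == 0) {
                out.write((0 << 6) | csid);               // fmt 0: full message header follows
                writeUInt24(out, timestamp);
                writeUInt24(out, payload.length);          // message length
                out.write(msgTypeId);                      // e.g. 8 = audio, 9 = video
                writeUInt32LE(out, msgStreamId);           // message stream id, little-endian
            } else {
                out.write((3 << 6) | csid);               // fmt 3: continuation, header reused
            }
            out.write(payload, offset, Math.min(CHUNK_SIZE, payload.length - offset));
        }
        return out.toByteArray();
    }

    private static void writeUInt24(ByteArrayOutputStream out, int v) {
        out.write((v >> 16) & 0xFF); out.write((v >> 8) & 0xFF); out.write(v & 0xFF);
    }

    private static void writeUInt32LE(ByteArrayOutputStream out, int v) {
        out.write(v & 0xFF); out.write((v >> 8) & 0xFF);
        out.write((v >> 16) & 0xFF); out.write((v >> 24) & 0xFF);
    }
}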
Note: A later article will describe RTMP in detail.
There are also other protocols, such as RTP; the general principles are similar, so they are not covered one by one here.
7. Poor-network handling
Under a good network, video and audio can be sent out promptly without piling up locally, the stream stays smooth, and latency is low. Under a poor network, when the audio and video data cannot be sent out fast enough, it has to be dealt with. There are generally four techniques for this: buffer design, network detection, frame dropping, and bitrate reduction.
1. Buffer design
Video and audio data is pushed into a buffer, and the sender takes data out of the buffer and sends it, forming an asynchronous producer-consumer pattern. The producer only needs to push the captured and encoded frames into the buffer; the consumer is responsible for taking them out and sending them.
(Figure: video and audio buffer)
Only video frames are shown in the figure above; in practice there are audio frames as well. Java already provides suitable classes for building an asynchronous producer-consumer model, and since frames will later need to be dropped, inserted, and removed, LinkedBlockingQueue is clearly a very good choice, as sketched below.
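A minimal sketch of such a buffer built on LinkedBlockingQueue (the Frame class is just an illustrative holder for encoded data):

import java.util.concurrent.LinkedBlockingQueue;

public class FrameBuffer {
    // Illustrative frame holder: encoded bytes plus a flag marking video key frames.
    public static class Frame {
        public final byte[] data;
        public final boolean isKeyFrame;
        public Frame(byte[] data, boolean isKeyFrame) { this.data = data; this.isKeyFrame = isKeyFrame; }
    }

    private final LinkedBlockingQueue<Frame> queue = new LinkedBlockingQueue<>();

    // Producer side: the encoder pushes frames in.
    public void push(Frame frame) { queue.offer(frame); }

    // Consumer side: the sender thread blocks until a frame is available.
    public Frame take() throws InterruptedException { return queue.take(); }

    public int size() { return queue.size(); }

    public LinkedBlockingQueue<Frame> raw() { return queue; }
}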
2. Network detection
An important part of handling a poor network is network detection: if degradation can be detected quickly and reacted to, the system becomes much more responsive and the result is noticeably better.
Every second we compare the amount of data pushed into the buffer with the amount actually sent out. If less is sent than is produced, the available bandwidth is insufficient and the buffer will keep growing, at which point the corresponding countermeasures are triggered, as sketched below.
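A minimal sketch of that comparison (the once-per-second scheduling and the 0.8 threshold are arbitrary example choices):

import java.util.concurrent.atomic.AtomicLong;

public class NetworkMonitor {
    private final AtomicLong producedBytes = new AtomicLong();
    private final AtomicLong sentBytes = new AtomicLong();

    public void onProduced(int bytes) { producedBytes.addAndGet(bytes); }
    public void onSent(int bytes)     { sentBytes.addAndGet(bytes); }

    // Called once per second, e.g. from a ScheduledExecutorService.
    public boolean isCongested() {
        long produced = producedBytes.getAndSet(0);
        long sent = sentBytes.getAndSet(0);
        // If we send noticeably less than we produce, the buffer is filling up.
        return sent < produced * 0.8;
    }
}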
3. Frame dropping
When network degradation is detected, dropping frames is a good response mechanism. After encoding, video consists of key frames and non-key frames: a key frame is a complete picture, while a non-key frame only describes changes relative to other frames.
The frame-dropping strategy can be defined however you like, with one caveat: if you drop a P frame (a non-key frame), you must drop all the non-key frames between the two surrounding key frames, otherwise the picture will break up into mosaics (see the sketch below). Beyond that, the design of the strategy depends on your requirements.
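A minimal sketch of that rule, reusing the Frame/FrameBuffer types from the buffer sketch above: frames are removed from the head of the queue until the next key frame, so no remaining non-key frame is left without its reference:

import java.util.concurrent.LinkedBlockingQueue;

public class FrameDropper {
    // Drops queued frames up to (but not including) the next key frame.
    public static void dropUntilNextKeyFrame(LinkedBlockingQueue<FrameBuffer.Frame> queue) {
        FrameBuffer.Frame head = queue.peek();
        while (head != null && !head.isKeyFrame) {
            queue.poll();          // discard this non-key frame
            head = queue.peek();
        }
    }
}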
4. Bitrate reduction
On Android, if hardware encoding is used, the encoder bitrate can be changed in real time to keep the broadcast smooth in a poor network environment. When a poor network is detected, we can lower the video bitrate at the same time as dropping frames. From Android SDK version 19 onward, parameters can be passed to MediaCodec to change the output bitrate of the hardware encoder:
// Request a new bitrate from the hardware encoder on the fly (API level 19+).
Bundle bitrate = new Bundle();
bitrate.putInt(MediaCodec.PARAMETER_KEY_VIDEO_BITRATE, bps * 1024);
mMediaCodec.setParameters(bitrate);
8. Sending
After all this processing, the data finally has to be sent out, and this step is relatively simple. Whether HTTP-FLV or RTMP is used, the connection is established over TCP. Before going live, connect to the server through a Socket to verify that it is reachable; once connected, use that Socket to send data to the server, and close the Socket when the stream ends.
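A minimal sketch of this step over a plain Socket (host, port, timeout, and flush policy are illustrative):

import java.io.IOException;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.net.Socket;

public class StreamSender {
    private Socket socket;
    private OutputStream out;

    public boolean connect(String host, int port) {
        try {
            socket = new Socket();
            socket.connect(new InetSocketAddress(host, port), 5000); // 5 s connect timeout
            out = socket.getOutputStream();
            return true;
        } catch (IOException e) {
            return false;   // server unreachable; do not start the live stream
        }
    }

    public void send(byte[] packet) throws IOException {
        out.write(packet);  // packet is an FLV tag or an RTMP chunk sequence
        out.flush();
    }

    public void close() {
        if (socket == null) return;
        try { socket.close(); } catch (IOException ignored) { }
    }
}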