Introduction
In this article we continue with overriding parts of the WebRTC Native Lib, focusing on how to restrict which ports it uses and how to rewrite the encoding process. My other notes on using WebRTC from Java are collected in "Using WebRTC in Java"; readers interested in this topic can browse them there. The source code for this article can be obtained by scanning the official-account QR code at the bottom of the article, or via paid download.
Restricting Connection Ports
Let's first review the overall flow of the port restriction. When creating the PeerConnectionFactory, we instantiated a SocketFactory and a default NetworkManager. Later, when creating the PeerConnection, we used those two instances to build a PortAllocator and injected it into the PeerConnection. Throughout this flow, the code that actually restricts ports lives in the SocketFactory, although the PortAllocator API is used as well. You may wonder: doesn't the PortAllocator already expose an interface for limiting the port range, so why is the SocketFactory still needed?
std::unique_ptr<cricket::PortAllocator> port_allocator(
        new cricket::BasicPortAllocator(network_manager.get(), socket_factory.get()));
port_allocator->SetPortRange(this->min_port, this->max_port); // PortAllocator's port-range API
I initially restricted ports only through this API, but I found that WebRTC would still request ports outside the range for other purposes. So in the end I overrode the SocketFactory directly and rejected every request for an out-of-range port. In addition, since our servers have some subnet IPs that must not be used, I handle those in the SocketFactory as well. My implementation looks like this:
rtc::AsyncPacketSocket *
rtc::SocketFactoryWrapper::CreateUdpSocket(const rtc::SocketAddress &local_address, uint16_t min_port,
                                           uint16_t max_port) {
    // Reject out-of-range ports
    if (min_port < this->min_port || max_port > this->max_port) {
        WEBRTC_LOG("Create udp socket cancelled, port out of range, expected port range is: " +
                   std::to_string(this->min_port) + "->" + std::to_string(this->max_port) +
                   ", parameter port range is: " + std::to_string(min_port) + "->" + std::to_string(max_port),
                   LogLevel::INFO);
        return nullptr;
    }
    // Reject IPs outside the whitelist
    if (!local_address.IsPrivateIP() || local_address.HostAsURIString().find(this->white_private_ip_prefix) == 0) {
        rtc::AsyncPacketSocket *result = BasicPacketSocketFactory::CreateUdpSocket(local_address, min_port, max_port);
        const auto *address = static_cast<const void *>(result);
        std::stringstream ss;
        ss << address;
        WEBRTC_LOG("Create udp socket, min port is: " + std::to_string(min_port) + ", max port is: " +
                   std::to_string(max_port) + ", result is: " + result->GetLocalAddress().ToString() + "->" +
                   result->GetRemoteAddress().ToString() + ", new socket address is: " + ss.str(), LogLevel::INFO);
        return result;
    } else {
        WEBRTC_LOG("Create udp socket cancelled, this ip is not in white list: " + local_address.HostAsURIString(),
                   LogLevel::INFO);
        return nullptr;
    }
}
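To make the rule above concrete, here is a small self-contained sketch of the decision the factory makes. The function name and parameters are hypothetical, just boiling the wrapper's member state down to arguments:

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical stand-in for the checks SocketFactoryWrapper performs:
// a socket request is allowed only if the requested port range lies inside
// the configured range, and a private local IP matches the whitelisted
// subnet prefix (public IPs pass through unchanged).
bool IsSocketRequestAllowed(uint16_t min_port, uint16_t max_port,
                            uint16_t allowed_min, uint16_t allowed_max,
                            bool is_private_ip, const std::string &host,
                            const std::string &white_prefix) {
    if (min_port < allowed_min || max_port > allowed_max) {
        return false;  // port out of range: refuse to create the socket
    }
    // Mirrors `!IsPrivateIP() || host.find(prefix) == 0` in the code above.
    return !is_private_ip || host.rfind(white_prefix, 0) == 0;
}
```

With this shape, a request for ports 50000-50100 on a whitelisted 10.0.x.x address passes, while the same request on a non-whitelisted private address is refused.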
Custom Video Encoding
As you may know, WebRTC uses VP8 by default, and VP8 is widely considered inferior to H264. Moreover, Safari does not support VP8, so when talking to Safari, WebRTC falls back to OpenH264 for video encoding, and OpenH264 is less efficient than libx264. My improvements to the encoding path therefore focus on:
- Replacing the default codec with H264
- Encoding video with libx264 via FFmpeg, using GPU acceleration (h264_nvenc) when the host has a capable GPU
- Supporting bitrate changes at runtime
Replacing the Default Codec
Switching the default codec to H264 is straightforward: we only need to override VideoEncoderFactory's GetSupportedFormats:
// Returns a list of supported video formats in order of preference, to use
// for signaling etc.
std::vector<webrtc::SdpVideoFormat> GetSupportedFormats() const override {
    return GetAllSupportedFormats();
}

// Here I only advertise H264, with packetization mode NonInterleaved
std::vector<webrtc::SdpVideoFormat> GetAllSupportedFormats() {
    std::vector<webrtc::SdpVideoFormat> supported_codecs;
    supported_codecs.emplace_back(CreateH264Format(webrtc::H264::kProfileBaseline, webrtc::H264::kLevel3_1, "1"));
    return supported_codecs;
}

webrtc::SdpVideoFormat CreateH264Format(webrtc::H264::Profile profile,
                                        webrtc::H264::Level level,
                                        const std::string &packetization_mode) {
    const absl::optional<std::string> profile_string =
            webrtc::H264::ProfileLevelIdToString(webrtc::H264::ProfileLevelId(profile, level));
    RTC_CHECK(profile_string);
    return webrtc::SdpVideoFormat(cricket::kH264CodecName,
                                  {{cricket::kH264FmtpProfileLevelId, *profile_string},
                                   {cricket::kH264FmtpLevelAsymmetryAllowed, "1"},
                                   {cricket::kH264FmtpPacketizationMode, packetization_mode}});
}
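For intuition, the SdpVideoFormat built above boils down to an fmtp parameter map that later shows up on the SDP's fmtp line. A rough stand-alone sketch using a plain std::map (the profile-level-id value is normally produced by ProfileLevelIdToString, so "42e01f" in the test below is only an illustrative placeholder):

```cpp
#include <map>
#include <string>

// Sketch of the parameter map CreateH264Format assembles; the keys match the
// cricket::kH264Fmtp* constants used above.
std::map<std::string, std::string> MakeH264Fmtp(const std::string &profile_level_id,
                                                const std::string &packetization_mode) {
    return {
        {"profile-level-id", profile_level_id},
        {"level-asymmetry-allowed", "1"},
        {"packetization-mode", packetization_mode},
    };
}
```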
Implementing the Encoder
Next comes the FFmpeg-based implementation of the VideoEncoder interface. For the FFmpeg usage I mainly followed the official examples. Let's first look at which parts of VideoEncoder we need to implement:
FFmpegH264EncoderImpl(const cricket::VideoCodec &codec, bool hardware_accelerate);

~FFmpegH264EncoderImpl() override;

// |max_payload_size| is ignored.
// The following members of |codec_settings| are used. The rest are ignored.
// - codecType (must be kVideoCodecH264)
// - targetBitrate
// - maxFramerate
// - width
// - height
// Initializes the encoder
int32_t InitEncode(const webrtc::VideoCodec *codec_settings,
                   int32_t number_of_cores,
                   size_t max_payload_size) override;

// Releases all resources
int32_t Release() override;

// Registers the callback through which finished frames are handed back
int32_t RegisterEncodeCompleteCallback(
        webrtc::EncodedImageCallback *callback) override;

// WebRTC's own rate controller calls this to adapt the bitrate to network conditions
int32_t SetRateAllocation(const webrtc::VideoBitrateAllocation &bitrate_allocation,
                          uint32_t framerate) override;

// The result of encoding - an EncodedImage and RTPFragmentationHeader - are
// passed to the encode complete callback.
int32_t Encode(const webrtc::VideoFrame &frame,
               const webrtc::CodecSpecificInfo *codec_specific_info,
               const std::vector<webrtc::FrameType> *frame_types) override;
When implementing this interface I referred to WebRTC's official OpenH264 encoder. Note that WebRTC supports Simulcast, so there may be several encoder instances, one per stream. Since this part is fairly involved, I will walk through my implementation step by step.
Let me first introduce the struct and member variables I defined:
// Holds all resources belonging to one encoder instance
typedef struct {
    AVCodec *codec = nullptr;          // the codec implementation
    AVFrame *frame = nullptr;          // raw pixel data before encoding
    AVCodecContext *context = nullptr; // codec context holding the encoder settings
    AVPacket *pkt = nullptr;           // packet holding the encoded bitstream
} CodecCtx;

// Encoder instances
std::vector<CodecCtx *> encoders_;
// Per-encoder parameters
std::vector<LayerConfig> configurations_;
// Encoded images
std::vector<webrtc::EncodedImage> encoded_images_;
// Buffers backing the encoded images
std::vector<std::unique_ptr<uint8_t[]>> encoded_image_buffers_;
// Codec settings
webrtc::VideoCodec codec_;
webrtc::H264PacketizationMode packetization_mode_;
size_t max_payload_size_;
int32_t number_of_cores_;
// Callback invoked when encoding completes
webrtc::EncodedImageCallback *encoded_image_callback_;
The constructor is simple: it records the packetization mode and reserves storage:
FFmpegH264EncoderImpl::FFmpegH264EncoderImpl(const cricket::VideoCodec &codec, bool hardware)
        : packetization_mode_(webrtc::H264PacketizationMode::SingleNalUnit),
          max_payload_size_(0),
          hardware_accelerate(hardware),
          number_of_cores_(0),
          encoded_image_callback_(nullptr),
          has_reported_init_(false),
          has_reported_error_(false) {
    RTC_CHECK(cricket::CodecNamesEq(codec.name, cricket::kH264CodecName));
    std::string packetization_mode_string;
    if (codec.GetParam(cricket::kH264FmtpPacketizationMode,
                       &packetization_mode_string) &&
        packetization_mode_string == "1") {
        packetization_mode_ = webrtc::H264PacketizationMode::NonInterleaved;
    }
    encoded_images_.reserve(webrtc::kMaxSimulcastStreams);
    encoded_image_buffers_.reserve(webrtc::kMaxSimulcastStreams);
    encoders_.reserve(webrtc::kMaxSimulcastStreams);
    configurations_.reserve(webrtc::kMaxSimulcastStreams);
}
Then comes the crucial encoder initialization. Here I first run a few sanity checks and then create one encoder instance per stream:
int32_t FFmpegH264EncoderImpl::InitEncode(const webrtc::VideoCodec *inst,
                                          int32_t number_of_cores,
                                          size_t max_payload_size) {
    ReportInit();
    if (!inst || inst->codecType != webrtc::kVideoCodecH264) {
        ReportError();
        return WEBRTC_VIDEO_CODEC_ERR_PARAMETER;
    }
    if (inst->maxFramerate == 0) {
        ReportError();
        return WEBRTC_VIDEO_CODEC_ERR_PARAMETER;
    }
    if (inst->width < 1 || inst->height < 1) {
        ReportError();
        return WEBRTC_VIDEO_CODEC_ERR_PARAMETER;
    }

    int32_t release_ret = Release();
    if (release_ret != WEBRTC_VIDEO_CODEC_OK) {
        ReportError();
        return release_ret;
    }

    int number_of_streams = webrtc::SimulcastUtility::NumberOfSimulcastStreams(*inst);
    bool doing_simulcast = (number_of_streams > 1);
    if (doing_simulcast && (!webrtc::SimulcastUtility::ValidSimulcastResolutions(
                                    *inst, number_of_streams) ||
                            !webrtc::SimulcastUtility::ValidSimulcastTemporalLayers(
                                    *inst, number_of_streams))) {
        return WEBRTC_VIDEO_CODEC_ERR_SIMULCAST_PARAMETERS_NOT_SUPPORTED;
    }

    encoded_images_.resize(static_cast<unsigned long>(number_of_streams));
    encoded_image_buffers_.resize(static_cast<unsigned long>(number_of_streams));
    encoders_.resize(static_cast<unsigned long>(number_of_streams));
    configurations_.resize(static_cast<unsigned long>(number_of_streams));
    for (int i = 0; i < number_of_streams; i++) {
        encoders_[i] = new CodecCtx();
    }

    number_of_cores_ = number_of_cores;
    max_payload_size_ = max_payload_size;
    codec_ = *inst;

    // Code expects simulcastStream resolutions to be correct, make sure they are
    // filled even when there are no simulcast layers.
    if (codec_.numberOfSimulcastStreams == 0) {
        codec_.simulcastStream[0].width = codec_.width;
        codec_.simulcastStream[0].height = codec_.height;
    }

    for (int i = 0, idx = number_of_streams - 1; i < number_of_streams;
         ++i, --idx) {
        // Temporal layers still not supported.
        if (inst->simulcastStream[i].numberOfTemporalLayers > 1) {
            Release();
            return WEBRTC_VIDEO_CODEC_ERR_SIMULCAST_PARAMETERS_NOT_SUPPORTED;
        }

        // Set internal settings from codec_settings
        configurations_[i].simulcast_idx = idx;
        configurations_[i].sending = false;
        configurations_[i].width = codec_.simulcastStream[idx].width;
        configurations_[i].height = codec_.simulcastStream[idx].height;
        configurations_[i].max_frame_rate = static_cast<float>(codec_.maxFramerate);
        configurations_[i].frame_dropping_on = codec_.H264()->frameDroppingOn;
        configurations_[i].key_frame_interval = codec_.H264()->keyFrameInterval;

        // Codec_settings uses kbits/second; encoder uses bits/second.
        configurations_[i].max_bps = codec_.maxBitrate * 1000;
        configurations_[i].target_bps = codec_.startBitrate * 1000;
        if (!OpenEncoder(encoders_[i], configurations_[i])) {
            Release();
            ReportError();
            return WEBRTC_VIDEO_CODEC_ERROR;
        }

        // Initialize encoded image. Default buffer size: size of unencoded data.
        encoded_images_[i]._size =
                CalcBufferSize(webrtc::VideoType::kI420, codec_.simulcastStream[idx].width,
                               codec_.simulcastStream[idx].height);
        encoded_images_[i]._buffer = new uint8_t[encoded_images_[i]._size];
        encoded_image_buffers_[i].reset(encoded_images_[i]._buffer);
        encoded_images_[i]._completeFrame = true;
        encoded_images_[i]._encodedWidth = codec_.simulcastStream[idx].width;
        encoded_images_[i]._encodedHeight = codec_.simulcastStream[idx].height;
        encoded_images_[i]._length = 0;
    }

    webrtc::SimulcastRateAllocator init_allocator(codec_);
    webrtc::BitrateAllocation allocation = init_allocator.GetAllocation(
            codec_.startBitrate * 1000, codec_.maxFramerate);
    return SetRateAllocation(allocation, codec_.maxFramerate);
}
The OpenEncoder function performs the actual encoder creation. One subtle point: when allocating the AVFrame, remember to request 32-byte memory alignment, something we already touched on when capturing image data.
bool FFmpegH264EncoderImpl::OpenEncoder(FFmpegH264EncoderImpl::CodecCtx *ctx, H264Encoder::LayerConfig &config) {
    int ret;
    /* find the H264 video encoder */
#ifdef WEBRTC_LINUX
    if (hardware_accelerate) {
        ctx->codec = avcodec_find_encoder_by_name("h264_nvenc");
    }
#endif
    if (!ctx->codec) {
        ctx->codec = avcodec_find_encoder_by_name("libx264");
    }
    if (!ctx->codec) {
        WEBRTC_LOG("Codec not found", ERROR);
        return false;
    }
    WEBRTC_LOG("Open encoder: " + std::string(ctx->codec->name) + ", and generate frame, packet", INFO);

    ctx->context = avcodec_alloc_context3(ctx->codec);
    if (!ctx->context) {
        WEBRTC_LOG("Could not allocate video codec context", ERROR);
        return false;
    }
    config.target_bps = config.max_bps;
    SetContext(ctx, config, true);

    /* open it */
    ret = avcodec_open2(ctx->context, ctx->codec, nullptr);
    if (ret < 0) {
        WEBRTC_LOG("Could not open codec, error code:" + std::to_string(ret), ERROR);
        avcodec_free_context(&(ctx->context));
        return false;
    }

    ctx->frame = av_frame_alloc();
    if (!ctx->frame) {
        WEBRTC_LOG("Could not allocate video frame", ERROR);
        return false;
    }
    ctx->frame->format = ctx->context->pix_fmt;
    ctx->frame->width = ctx->context->width;
    ctx->frame->height = ctx->context->height;
    ctx->frame->color_range = ctx->context->color_range;
    /* the image can be allocated by any means and av_image_alloc() is
     * just the most convenient way if av_malloc() is to be used;
     * note the 32-byte alignment */
    ret = av_image_alloc(ctx->frame->data, ctx->frame->linesize, ctx->context->width, ctx->context->height,
                         ctx->context->pix_fmt, 32);
    if (ret < 0) {
        WEBRTC_LOG("Could not allocate raw picture buffer", ERROR);
        return false;
    }
    ctx->frame->pts = 1;
    ctx->pkt = av_packet_alloc();
    return true;
}
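A side note on that 32-byte alignment: passing 32 as the `align` argument of av_image_alloc makes each plane's linesize a multiple of 32 bytes, which means a row can be padded beyond the visible width. The rounding itself is just the usual bit trick, sketched here stand-alone:

```cpp
#include <cstddef>

// Round a row width up to the next multiple of 32, as av_image_alloc(..., 32)
// does for each plane's linesize.
size_t AlignUp32(size_t linesize) {
    return (linesize + 31) & ~static_cast<size_t>(31);
}
```

This is why code that copies raw frames into the AVFrame must honor `frame->linesize` instead of assuming rows are exactly `width` bytes apart.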
SetContext configures the FFmpeg encoder's parameters:
void FFmpegH264EncoderImpl::SetContext(CodecCtx *ctx, H264Encoder::LayerConfig &config, bool init) {
    if (init) {
        AVRational rational = {1, 25};
        ctx->context->time_base = rational;
        ctx->context->max_b_frames = 0;
        ctx->context->pix_fmt = AV_PIX_FMT_YUV420P;
        ctx->context->codec_type = AVMEDIA_TYPE_VIDEO;
        ctx->context->codec_id = AV_CODEC_ID_H264;
        ctx->context->gop_size = config.key_frame_interval;
        ctx->context->color_range = AVCOL_RANGE_JPEG;
        // Two options that make encoding much faster
        if (std::string(ctx->codec->name) == "libx264") {
            av_opt_set(ctx->context->priv_data, "preset", "ultrafast", 0);
            av_opt_set(ctx->context->priv_data, "tune", "zerolatency", 0);
        }
        av_log_set_level(AV_LOG_ERROR);
        WEBRTC_LOG("Init bitrate: " + std::to_string(config.target_bps), INFO);
    } else {
        WEBRTC_LOG("Change bitrate: " + std::to_string(config.target_bps), INFO);
    }
    config.key_frame_request = true;
    ctx->context->width = config.width;
    ctx->context->height = config.height;

    ctx->context->bit_rate = config.target_bps * 0.7;
    ctx->context->rc_max_rate = config.target_bps * 0.85;
    ctx->context->rc_min_rate = config.target_bps * 0.1;
    ctx->context->rc_buffer_size = config.target_bps * 2; // changing buffer_size is what makes libx264
                                                          // actually apply the new rates; without it the
                                                          // settings above take no effect
#ifdef WEBRTC_LINUX
    if (std::string(ctx->codec->name) == "h264_nvenc") { // poke h264_nvenc's private context to set the
                                                         // bitrate, much like reflection in Java
        NvencContext *nvenc_ctx = (NvencContext *) ctx->context->priv_data;
        nvenc_ctx->encode_config.rcParams.averageBitRate = ctx->context->bit_rate;
        nvenc_ctx->encode_config.rcParams.maxBitRate = ctx->context->rc_max_rate;
        return;
    }
#endif
}
The last few lines of SetContext are about changing the encoder bitrate dynamically. They are probably the most hardcore part of the whole encoder setup, and they are exactly how I achieve runtime bitrate control for both libx264 and h264_nvenc.
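The relationship between the target bitrate and the derived rate-control fields can be summarized in a tiny stand-alone sketch. It mirrors the ratios used in SetContext above (integer arithmetic here for determinism, whereas the code above uses floating-point factors; the 0.7/0.85/0.1 split is simply the choice made in this article, not a universal rule):

```cpp
#include <cstdint>

// Rate-control fields derived from one target bitrate, mirroring SetContext.
struct RateControl {
    int64_t bit_rate;        // average bitrate (70% of target)
    int64_t rc_max_rate;     // peak bitrate (85% of target)
    int64_t rc_min_rate;     // floor (10% of target)
    int64_t rc_buffer_size;  // changing this triggers libx264 to re-apply rates
};

RateControl ComputeRateControl(int64_t target_bps) {
    RateControl rc;
    rc.bit_rate = target_bps * 7 / 10;
    rc.rc_max_rate = target_bps * 85 / 100;
    rc.rc_min_rate = target_bps / 10;
    rc.rc_buffer_size = target_bps * 2;
    return rc;
}
```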
Having covered the big chunk that is encoder initialization, let's relax with two simple interfaces: registering the encode-complete callback, and the entry point for WebRTC's rate controller, which, as mentioned earlier, adjusts the encoding bitrate to network conditions.
int32_t FFmpegH264EncoderImpl::RegisterEncodeCompleteCallback(
        webrtc::EncodedImageCallback *callback) {
    encoded_image_callback_ = callback;
    return WEBRTC_VIDEO_CODEC_OK;
}

int32_t FFmpegH264EncoderImpl::SetRateAllocation(
        const webrtc::BitrateAllocation &bitrate,
        uint32_t new_framerate) {
    if (encoders_.empty())
        return WEBRTC_VIDEO_CODEC_UNINITIALIZED;

    if (new_framerate < 1)
        return WEBRTC_VIDEO_CODEC_ERR_PARAMETER;

    if (bitrate.get_sum_bps() == 0) {
        // Encoder paused, turn off all encoding.
        for (auto &configuration : configurations_)
            configuration.SetStreamState(false);
        return WEBRTC_VIDEO_CODEC_OK;
    }

    // At this point, bitrate allocation should already match codec settings.
    if (codec_.maxBitrate > 0)
        RTC_DCHECK_LE(bitrate.get_sum_kbps(), codec_.maxBitrate);
    RTC_DCHECK_GE(bitrate.get_sum_kbps(), codec_.minBitrate);
    if (codec_.numberOfSimulcastStreams > 0)
        RTC_DCHECK_GE(bitrate.get_sum_kbps(), codec_.simulcastStream[0].minBitrate);

    codec_.maxFramerate = new_framerate;

    size_t stream_idx = encoders_.size() - 1;
    for (size_t i = 0; i < encoders_.size(); ++i, --stream_idx) {
        // Update layer config.
        configurations_[i].target_bps = bitrate.GetSpatialLayerSum(stream_idx);
        configurations_[i].max_frame_rate = static_cast<float>(new_framerate);

        if (configurations_[i].target_bps) {
            configurations_[i].SetStreamState(true);
            SetContext(encoders_[i], configurations_[i], false);
        } else {
            configurations_[i].SetStreamState(false);
        }
    }
    return WEBRTC_VIDEO_CODEC_OK;
}
Break over. Now for the last hard nut to crack: the encoding process itself. It looks simple, but a big pitfall hides in it.
int32_t FFmpegH264EncoderImpl::Encode(const webrtc::VideoFrame &input_frame,
                                      const webrtc::CodecSpecificInfo *codec_specific_info,
                                      const std::vector<webrtc::FrameType> *frame_types) {
    // Routine checks first
    if (encoders_.empty()) {
        ReportError();
        return WEBRTC_VIDEO_CODEC_UNINITIALIZED;
    }
    if (!encoded_image_callback_) {
        RTC_LOG(LS_WARNING)
                << "InitEncode() has been called, but a callback function "
                << "has not been set with RegisterEncodeCompleteCallback()";
        ReportError();
        return WEBRTC_VIDEO_CODEC_UNINITIALIZED;
    }

    // Fetch the video frame
    webrtc::I420BufferInterface *frame_buffer = (webrtc::I420BufferInterface *) input_frame.video_frame_buffer().get();

    // Check whether the next frame must be a key frame; a bitrate change usually requests one
    bool send_key_frame = false;
    for (auto &configuration : configurations_) {
        if (configuration.key_frame_request && configuration.sending) {
            send_key_frame = true;
            break;
        }
    }
    if (!send_key_frame && frame_types) {
        for (size_t i = 0; i < frame_types->size() && i < configurations_.size();
             ++i) {
            if ((*frame_types)[i] == webrtc::kVideoFrameKey && configurations_[i].sending) {
                send_key_frame = true;
                break;
            }
        }
    }

    RTC_DCHECK_EQ(configurations_[0].width, frame_buffer->width());
    RTC_DCHECK_EQ(configurations_[0].height, frame_buffer->height());

    // Encode image for each layer.
    for (size_t i = 0; i < encoders_.size(); ++i) {
        // EncodeFrame input.
        copyFrame(encoders_[i]->frame, frame_buffer);
        if (!configurations_[i].sending) {
            continue;
        }
        if (frame_types != nullptr) {
            // Skip frame?
            if ((*frame_types)[i] == webrtc::kEmptyFrame) {
                continue;
            }
        }
        // Tell the encoder to emit a key frame when needed
        if (send_key_frame || encoders_[i]->frame->pts % configurations_[i].key_frame_interval == 0) {
            encoders_[i]->frame->key_frame = 1;
            encoders_[i]->frame->pict_type = AV_PICTURE_TYPE_I;
            configurations_[i].key_frame_request = false;
        } else {
            encoders_[i]->frame->key_frame = 0;
            encoders_[i]->frame->pict_type = AV_PICTURE_TYPE_P;
        }

        // Encode!
        int enc_ret;
        // Feed the picture to the encoder
        enc_ret = avcodec_send_frame(encoders_[i]->context, encoders_[i]->frame);
        if (enc_ret != 0) {
            WEBRTC_LOG("FFMPEG send frame failed, returned " + std::to_string(enc_ret), ERROR);
            ReportError();
            return WEBRTC_VIDEO_CODEC_ERROR;
        }
        encoders_[i]->frame->pts++;
        while (enc_ret >= 0) {
            // Drain encoded packets from the encoder
            enc_ret = avcodec_receive_packet(encoders_[i]->context, encoders_[i]->pkt);
            if (enc_ret == AVERROR(EAGAIN) || enc_ret == AVERROR_EOF) {
                break;
            } else if (enc_ret < 0) {
                WEBRTC_LOG("FFMPEG receive frame failed, returned " + std::to_string(enc_ret), ERROR);
                ReportError();
                return WEBRTC_VIDEO_CODEC_ERROR;
            }

            // Convert the encoder's output into the frame type WebRTC expects
            encoded_images_[i]._encodedWidth = static_cast<uint32_t>(configurations_[i].width);
            encoded_images_[i]._encodedHeight = static_cast<uint32_t>(configurations_[i].height);
            encoded_images_[i].SetTimestamp(input_frame.timestamp());
            encoded_images_[i].ntp_time_ms_ = input_frame.ntp_time_ms();
            encoded_images_[i].capture_time_ms_ = input_frame.render_time_ms();
            encoded_images_[i].rotation_ = input_frame.rotation();
            encoded_images_[i].content_type_ =
                    (codec_.mode == webrtc::VideoCodecMode::kScreensharing)
                    ? webrtc::VideoContentType::SCREENSHARE
                    : webrtc::VideoContentType::UNSPECIFIED;
            encoded_images_[i].timing_.flags = webrtc::VideoSendTiming::kInvalid;
            encoded_images_[i]._frameType = ConvertToVideoFrameType(encoders_[i]->frame);

            // Split encoded image up into fragments. This also updates
            // |encoded_image_|.
            // Here is the pitfall mentioned earlier: FFmpeg may separate NALUs
            // with either a 0001 or a 001 start code, while WebRTC only
            // recognizes NALUs starting with 0001. So we post-process the
            // encoder output and build an RTP fragmentation header describing the frame.
            webrtc::RTPFragmentationHeader frag_header;
            RtpFragmentize(&encoded_images_[i], &encoded_image_buffers_[i], *frame_buffer, encoders_[i]->pkt,
                           &frag_header);
            av_packet_unref(encoders_[i]->pkt);

            // Encoder can skip frames to save bandwidth in which case
            // |encoded_images_[i]._length| == 0.
            if (encoded_images_[i]._length > 0) {
                // Parse QP.
                h264_bitstream_parser_.ParseBitstream(encoded_images_[i]._buffer,
                                                      encoded_images_[i]._length);
                h264_bitstream_parser_.GetLastSliceQp(&encoded_images_[i].qp_);

                // Deliver encoded image.
                webrtc::CodecSpecificInfo codec_specific;
                codec_specific.codecType = webrtc::kVideoCodecH264;
                codec_specific.codecSpecific.H264.packetization_mode =
                        packetization_mode_;
                codec_specific.codecSpecific.H264.simulcast_idx = static_cast<uint8_t>(configurations_[i].simulcast_idx);
                encoded_image_callback_->OnEncodedImage(encoded_images_[i],
                                                        &codec_specific, &frag_header);
            }
        }
    }
    return WEBRTC_VIDEO_CODEC_OK;
}
Next is the NAL conversion and the extraction of the RTP fragmentation info:
// Helper method used by FFmpegH264EncoderImpl::Encode.
// Copies the encoded bytes from |packet| to |encoded_image| and updates the
// fragmentation information of |frag_header|. The |encoded_image->_buffer| may
// be deleted and reallocated if a bigger buffer is required.
// After FFmpeg encoding, the encoded bytes are spread out over a number of
// "NAL units". Each NAL unit is a fragment starting with a three- or four-byte
// start code. All of this data is copied to |encoded_image->_buffer|
// (normalized to four-byte start codes) and the |frag_header| is updated to
// point to each fragment, with offsets and lengths set to exclude the start codes.
void FFmpegH264EncoderImpl::RtpFragmentize(webrtc::EncodedImage *encoded_image,
                                           std::unique_ptr<uint8_t[]> *encoded_image_buffer,
                                           const webrtc::VideoFrameBuffer &frame_buffer, AVPacket *packet,
                                           webrtc::RTPFragmentationHeader *frag_header) {
    std::list<int> data_start_index;
    std::list<int> data_length;
    int payload_length = 0;
    // Walk the packet and record the start index and length of every NALU,
    // handling both 001 and 0001 start codes
    for (int i = 2; i < packet->size; i++) {
        if (i > 2
            && packet->data[i - 3] == start_code[0]
            && packet->data[i - 2] == start_code[1]
            && packet->data[i - 1] == start_code[2]
            && packet->data[i] == start_code[3]) {
            if (!data_start_index.empty()) {
                data_length.push_back((i - 3 - data_start_index.back()));
            }
            data_start_index.push_back(i + 1);
        } else if (packet->data[i - 2] == start_code[1] &&
                   packet->data[i - 1] == start_code[2] &&
                   packet->data[i] == start_code[3]) {
            if (!data_start_index.empty()) {
                data_length.push_back((i - 2 - data_start_index.back()));
            }
            data_start_index.push_back(i + 1);
        }
    }
    if (!data_start_index.empty()) {
        data_length.push_back((packet->size - data_start_index.back()));
    }
    for (auto &it : data_length) {
        payload_length += it;
    }
    // Calculate minimum buffer size required to hold encoded data.
    auto required_size = payload_length + data_start_index.size() * 4;
    if (encoded_image->_size < required_size) {
        // Increase buffer size. Allocate enough to hold an unencoded image, this
        // should be more than enough to hold any encoded data of future frames of
        // the same size (avoiding possible future reallocation due to variations in
        // required size).
        encoded_image->_size = CalcBufferSize(
                webrtc::VideoType::kI420, frame_buffer.width(), frame_buffer.height());
        if (encoded_image->_size < required_size) {
            // Encoded data > unencoded data. Allocate required bytes.
            WEBRTC_LOG("Encoding produced more bytes than the original image data! Original bytes: " +
                       std::to_string(encoded_image->_size) + ", encoded bytes: " + std::to_string(required_size) + ".",
                       WARNING);
            encoded_image->_size = required_size;
        }
        encoded_image->_buffer = new uint8_t[encoded_image->_size];
        encoded_image_buffer->reset(encoded_image->_buffer);
    }
    // Iterate NAL units, note each unit as a fragment and copy
    // the data to |encoded_image->_buffer|.
    int index = 0;
    encoded_image->_length = 0;
    frag_header->VerifyAndAllocateFragmentationHeader(data_start_index.size());
    for (auto it_start = data_start_index.begin(), it_length = data_length.begin();
         it_start != data_start_index.end(); ++it_start, ++it_length, ++index) {
        memcpy(encoded_image->_buffer + encoded_image->_length, start_code, sizeof(start_code));
        encoded_image->_length += sizeof(start_code);
        frag_header->fragmentationOffset[index] = encoded_image->_length;
        memcpy(encoded_image->_buffer + encoded_image->_length, packet->data + *it_start,
               static_cast<size_t>(*it_length));
        encoded_image->_length += *it_length;
        frag_header->fragmentationLength[index] = static_cast<size_t>(*it_length);
    }
}
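The start-code scan above can be hard to follow, so here is a stand-alone sketch of its core idea: locating the payload offset of every NALU in an Annex-B buffer while accepting both 3-byte (00 00 01) and 4-byte (00 00 00 01) start codes. The helper is hypothetical, not part of the encoder:

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Find the byte offset at which each NALU's payload begins. A 4-byte start
// code ends in the same 00 00 01 pattern as a 3-byte one, so matching the
// last three bytes covers both forms and yields the same payload offset.
std::vector<size_t> FindNaluOffsets(const uint8_t *data, size_t size) {
    std::vector<size_t> offsets;
    for (size_t i = 2; i < size; i++) {
        if (data[i] == 0x01 && data[i - 1] == 0x00 && data[i - 2] == 0x00) {
            offsets.push_back(i + 1);  // payload starts right after the start code
        }
    }
    return offsets;
}
```

Given a buffer containing an SPS behind a 4-byte code followed by a PPS behind a 3-byte code, this returns the two payload offsets; RtpFragmentize then copies each payload behind a normalized 4-byte start code, which is the form WebRTC's packetizer expects.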
Finally, the very simple encoder teardown:
int32_t FFmpegH264EncoderImpl::Release() {
    while (!encoders_.empty()) {
        CodecCtx *encoder = encoders_.back();
        CloseEncoder(encoder);
        encoders_.pop_back();
    }
    configurations_.clear();
    encoded_images_.clear();
    encoded_image_buffers_.clear();
    return WEBRTC_VIDEO_CODEC_OK;
}

void FFmpegH264EncoderImpl::CloseEncoder(FFmpegH264EncoderImpl::CodecCtx *ctx) {
    if (ctx) {
        if (ctx->context) {
            avcodec_close(ctx->context);
            avcodec_free_context(&(ctx->context));
        }
        if (ctx->frame) {
            av_frame_free(&(ctx->frame));
        }
        if (ctx->pkt) {
            av_packet_free(&(ctx->pkt));
        }
        WEBRTC_LOG("Close encoder context and release context, frame, packet", INFO);
        delete ctx;
    }
}
That concludes my account of working with WebRTC; I hope my experience helps you. If you made it all the way through, I'm genuinely impressed: I myself felt at times that this article had grown too long and covered too much. But the parts are tightly interlinked, and splitting them apart risked breaking the thread, so I followed one ordinary usage flow, introduced my modifications along the way, and finished with a detailed appendix on my changes to the WebRTC Native APIs.
Also, I have only recently started writing articles to share my experience, so some descriptions may fall short; please bear with me. If you spot anything wrong, please leave a comment and I will fix it as soon as I can.
Article Notes
More articles worth reading are collected in Beibei Mao's article index.
Copyright notice: Unless otherwise stated, all articles on this blog are licensed under BY-NC-SA. Please credit the source when reposting!
Creation statement: This article was created based on all of the reference material listed below, which may involve copying, modification, or transformation. All images come from the Internet; if any infringes your rights, please contact me and I will remove it promptly.
References
[1] JNA, the JNI alternative: accessing native interfaces from Java
[2] The fPIC compile flag for Linux shared objects
[3] Notes on Android JNI
[4] The FFmpeg repository