The core SDK consists of multiple hardware-accelerator plugins that use accelerators such as VIC, GPU, DLA, NVDEC and NVENC.
DeepStream supports secure, bidirectional communication between edge and cloud, and ships with several out-of-the-box security protocols, such as username/password authentication and two-way TLS authentication.
DeepStream builds on several NVIDIA libraries from the CUDA-X stack, such as CUDA, TensorRT, the Triton Inference Server and multimedia libraries. TensorRT accelerates AI inference on NVIDIA GPUs. DeepStream abstracts these libraries behind DeepStream plugins, so developers can build video-analytics pipelines without having to learn each individual library.
1. DeepStream Graph Architecture
DeepStream is an optimized graph architecture built on the open-source GStreamer framework. Each block in the graph is one of the plugins in use, and at the bottom are the different hardware engines used throughout the application.
Streaming data can arrive over the network via RTSP, from the local file system, or directly from a camera. Streams are captured using the CPU. Once the frames are in memory, they are sent for decoding using the NVDEC accelerator; the decode plugin is called Gst-nvvideo4linux2.
Supported decode formats: H.264, H.265, JPEG, MPEG4, MPEG2, VP8, VP9
Supported encode formats: H.264, H.265 (nvv4l2h264enc, nvv4l2h265enc)
Codec throughput
The second step is an optional image-preprocessing step: input images can be preprocessed before inference. These plugins use the GPU or the VIC (vision image compositor).
The Gst-nvdewarper plugin can dewarp images from fisheye or 360-degree cameras.
The Gst-nvvideoconvert plugin can perform color-format conversion on frames.
The third step is to batch the frames for best inference performance. Batching is done with the Gst-nvstreammux plugin.
The fourth step is inference on the batched frames. Inference can be done with TensorRT, NVIDIA's inference runtime, or in native frameworks such as TensorFlow or PyTorch using the Triton Inference Server. On Jetson AGX Xavier and Xavier NX, inference can run on the GPU or the DLA (Deep Learning Accelerator).
Native TensorRT inference is performed with the Gst-nvinfer plugin.
Inference with Triton is performed with the Gst-nvinferserver plugin.
The fifth step is object tracking. The SDK has several built-in reference trackers, ranging from high performance to high accuracy. Object tracking is performed with the Gst-nvtracker plugin.
The sixth step is creating visualization artifacts, such as bounding boxes, segmentation masks and labels, with the Gst-nvdsosd visualization plugin.
The seventh step is outputting the results; DeepStream offers several options:
Render the output with bounding boxes on screen;
Save the output to local disk;
Stream the output over RTSP;
Send the metadata to the cloud;
Gst-nvmsgconv converts the metadata into a schema payload
Gst-nvmsgbroker establishes the connection to the cloud and sends the telemetry data.
Several broker protocols are built in, such as Kafka, MQTT, AMQP and Azure IoT, and custom broker adapters can also be created.
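Putting these steps together end to end, a minimal sketch of such a pipeline can be written with gst-launch-1.0 (assuming a DeepStream 5.1 installation; the sample stream and primary-detector config are those shipped under /opt/nvidia/deepstream/deepstream-5.1/samples, and the sink choice is illustrative):
# Decode -> batch -> infer -> OSD -> render, for a single H.264 file
gst-launch-1.0 \
  filesrc location=/opt/nvidia/deepstream/deepstream-5.1/samples/streams/sample_720p.h264 ! \
  h264parse ! nvv4l2decoder ! \
  m.sink_0 nvstreammux name=m batch-size=1 width=1280 height=720 ! \
  nvinfer config-file-path=/opt/nvidia/deepstream/deepstream-5.1/samples/configs/deepstream-app/config_infer_primary.txt ! \
  nvvideoconvert ! nvdsosd ! nveglglessink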
2. DeepStream SDK 5.1
The DeepStream SDK is an accelerated AI framework for building intelligent video analytics (IVA) pipelines.
DeepStream 5.1 for Servers and Workstations
This release supports Tesla T4 and V100.
DeepStream 5.1 for Jetson
This release supports Jetson TX1, TX2, Nano, NX and AGX Xavier.
3. Transfer Learning Toolkit
NVIDIA TAO (Train, Adapt, and Optimize) is an AI-model adaptation platform that simplifies and accelerates the creation of enterprise AI applications and services. Through a guided, UI-based workflow, users can fine-tune pretrained models with their own data, without needing large training runs or deep AI expertise, and produce highly accurate computer vision, speech and language-understanding models in hours rather than months.
The Transfer Learning Toolkit (TLT) is the core component of the NVIDIA TAO platform: a Python-based AI toolkit that provides a one-stop solution for transfer training, pruning and quantization of pretrained models.
II. dGPU Setup for Ubuntu
dGPU refers to NVIDIA GPU expansion card products such as NVIDIA Tesla® T4 and P4, NVIDIA GeForce® GTX 1080, and NVIDIA GeForce® RTX 2080.
1. Install Dependencies
Remove all previous DeepStream installations
cd /opt/nvidia/deepstream/deepstream/
sudo bash ./uninstall.sh
Install Dependencies
$ sudo apt install \
libssl1.0.0 \
libgstreamer1.0-0 \
libgstreamer1.0-dev \
gstreamer1.0-tools \
gstreamer1.0-plugins-good \
gstreamer1.0-plugins-bad \
gstreamer1.0-plugins-ugly \
gstreamer1.0-libav \
libgstrtspserver-1.0-0 \
libjansson4
2. Installing the GPU Driver Directly [Not Recommended]
https://wangjunjian.com/gpu/2020/11/03/install-nvidia-gpu-driver-on-ubuntu.html
1) Uninstall the old GPU driver
# Remove the old driver before reinstalling
sudo apt-get remove nvidia*  # Do NOT reboot at this point; rebooting now may leave the system unable to boot.
sudo apt-get autoremove --purge nvidia*
sudo /usr/bin/nvidia-uninstall
2) Disable nouveau
Before installing the NVIDIA driver, nouveau must be disabled; otherwise it conflicts with the NVIDIA driver and the installation fails.
sudo vim /etc/modprobe.d/blacklist.conf
Append the following two lines at the end:
blacklist nouveau
options nouveau modeset=0
Regenerate the kernel initramfs:
sudo update-initramfs -u
sudo reboot
Check whether nouveau is disabled; empty output means it was disabled successfully:
sudo lsmod | grep nouveau
3)Install NVIDIA driver 460.32
Check the GPU model
$ lspci | grep -i nvidia
5e:00.0 3D controller: NVIDIA Corporation Device 1eb8 (rev a1)
Download the generic driver
Note: the dedicated T4 driver has no CUDA 11.1 build, which will make the subsequent installation steps fail.
This driver installs with CUDA Version: 11.2 by default.
$ chmod 755 NVIDIA-Linux-x86_64-460.32.03.run
$ ./NVIDIA-Linux-x86_64-460.32.03.run --no-opengl-files --no-x-check --no-nouveau-check
# sudo ./NVIDIA-Linux-x86_64-440.64.00.run --no-opengl-files --no-x-check --no-nouveau-check --kernel-source-path=/usr/src/linux-headers-5.4.0-65-generic
Show the driver information
$ nvidia-smi -L
GPU 0: Tesla T4 (UUID: GPU-158692f4-a7b8-dbe6-3376-65b22b16068d)
$ nvidia-smi
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 460.32.03 Driver Version: 460.32.03 CUDA Version: 11.2 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|===============================+======================+======================|
| 0 Tesla T4 Off | 00000000:5E:00.0 Off | Off |
| N/A 52C P0 22W / 70W | 0MiB / 16127MiB | 0% Default |
| | | N/A |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=============================================================================|
| No running processes found |
+-----------------------------------------------------------------------------+
4) Uninstall the NVIDIA driver
sudo ./NVIDIA-Linux-x86_64-460.32.03.run --uninstall
3. Install CUDA 11.1 and the NVIDIA Driver [Recommended]
nouveau still needs to be disabled first.
Download CUDA 11.1 and install it; installing CUDA also installs the NVIDIA driver.
wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64/cuda-ubuntu1804.pin
sudo mv cuda-ubuntu1804.pin /etc/apt/preferences.d/cuda-repository-pin-600
wget https://developer.download.nvidia.com/compute/cuda/11.1.1/local_installers/cuda-repo-ubuntu1804-11-1-local_11.1.1-455.32.00-1_amd64.deb
sudo dpkg -i cuda-repo-ubuntu1804-11-1-local_11.1.1-455.32.00-1_amd64.deb
sudo apt-key add /var/cuda-repo-ubuntu1804-11-1-local/7fa2af80.pub
sudo apt-get update
sudo apt-get -y install cuda
# Installation path
ls /usr/local/cuda-11.1/
Verify that the driver installed successfully; reinstalling CUDA automatically replaces the old version.
$ nvidia-smi -L
GPU 0: Tesla T4 (UUID: GPU-158692f4-a7b8-dbe6-3376-65b22b16068d)
$ nvidia-smi
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 455.32.00 Driver Version: 455.32.00 CUDA Version: 11.1 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|===============================+======================+======================|
| 0 Tesla T4 Off | 00000000:5E:00.0 Off | Off |
| N/A 58C P0 30W / 70W | 0MiB / 16127MiB | 4% Default |
| | | N/A |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=============================================================================|
| No running processes found |
+-----------------------------------------------------------------------------+
Configure the CUDA environment variables
sudo vim ~/.bashrc
# Append the following at the end
export PATH=/usr/local/cuda-11.1/bin:$PATH
export LD_LIBRARY_PATH=/usr/local/cuda-11.1/lib64:$LD_LIBRARY_PATH
export LD_LIBRARY_PATH=/usr/local/cuda-11.1/targets/x86_64-linux/lib/:/usr/local/cuda-11.1/targets/x86_64-linux/lib/stubs:$LD_LIBRARY_PATH
source ~/.bashrc
# Check the CUDA version
nvcc --version
Uninstall CUDA
cuda-uninstaller
4. Install TensorRT 7.2.2
https://developer.nvidia.com/nvidia-tensorrt-download
https://docs.nvidia.com/deeplearning/tensorrt/archives/tensorrt-723/install-guide/index.html#installing-debian
sudo dpkg -i nv-tensorrt-repo-ubuntu1804-cuda11.1-trt7.2.2.3-ga-20201211_1-1_amd64.deb
sudo apt-key add /var/nv-tensorrt-repo-cuda11.1-trt7.2.2.3-ga-20201211/7fa2af80.pub
sudo apt-get update
sudo apt-get install tensorrt
# If using Python 2.7:
sudo apt-get install python-libnvinfer-dev
# If using Python 3.x:
sudo apt-get install python3-libnvinfer-dev
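To verify the installation, list the installed TensorRT packages (this check follows the TensorRT install guide linked above):
$ dpkg -l | grep TensorRT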
5. Install cuDNN
https://developer.nvidia.com/rdp/cudnn-archive#a-collapse811-111
#------------------ Install cuDNN 8.1 -------------------------------------
tar -xzvf cudnn-11.2-linux-x64-v8.1.1.33.tgz
sudo cp cuda/include/* /usr/local/cuda/include/
sudo cp cuda/lib64/libcudnn* /usr/local/cuda/lib64/
sudo chmod a+r /usr/local/cuda/include/cudnn*.h
sudo chmod a+r /usr/local/cuda/lib64/libcudnn*
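To confirm which cuDNN version was installed (assuming the cuDNN 8 tarball layout above, which keeps the version macros in cudnn_version.h):
$ cat /usr/local/cuda/include/cudnn_version.h | grep CUDNN_MAJOR -A 2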
6. Install the DeepStream SDK
$ wget https://developer.nvidia.com/deepstream-51-510-1-amd64deb
$ sudo apt-get install ./deepstream-5.1_5.1.0-1_amd64.deb
$ deepstream-app --version-all
(gst-plugin-scanner:27728): GStreamer-WARNING **: 17:12:39.280: Failed to load plugin '/usr/lib/x86_64-linux-gnu/gstreamer-1.0/deepstream/libnvdsgst_inferserver.so': libtritonserver.so: cannot open shared object file: No such file or directory
deepstream-app version 5.1.0
DeepStreamSDK 5.1.0
CUDA Driver Version: 11.1
CUDA Runtime Version: 11.1
TensorRT Version: 7.2
cuDNN Version: 8.1
libNVWarp360 Version: 2.0.1d3
# Note: this is a harmless warning indicating that DeepStream's nvinferserver plugin cannot be used, because Triton Inference Server is not installed on the x86 (dGPU) platform.
nvds plugins
$ gst-inspect-1.0 --plugin|grep nvds
nvdsgst_infer: nvinfer: NvInfer plugin
nvdsgst_ofvisual: nvofvisual: nvofvisual
nvdsgst_osd: nvdsosd: NvDsOsd plugin
nvdsgst_inferaudio: nvinferaudio: NvInfer Audio plugin
nvdsgst_dsanalytics: nvdsanalytics: DsAnalytics plugin
nvdsgst_jpegdec: nvjpegdec: JPEG image decoder
nvdsgst_audiotemplate: nvdsaudiotemplate: DS AUDIO template Plugin for Transform IP use-cases
nvdsgst_msgbroker: nvmsgbroker: Message Broker
nvdsgst_dewarper: nvdewarper: nvdewarper
nvdsgst_multistream: nvstreamdemux: Stream demultiplexer
nvdsgst_multistream: nvstreammux: Stream multiplexer
nvdsgst_multistreamtiler: nvmultistreamtiler: Stream Tiler DS
nvdsgst_tracker: nvtracker: NvTracker plugin
nvdsgst_videotemplate: nvdsvideotemplate: NvDsVideoTemplate plugin for Transform/In-Place use-cases
nvdsgst_segvisual: nvsegvisual: nvsegvisual
nvdsgst_of: nvof: nvof
nvdsgst_eglglessink: nveglglessink: EGL/GLES vout Sink
nvdsgst_dsexample: dsexample: DsExample plugin
nvdsgst_msgconv: nvmsgconv: Message Converter
nvvideoconvert: nvvideoconvert: NvVidConv Plugin
nvvideo4linux2: nvv4l2h265enc: V4L2 H.265 Encoder
nvvideo4linux2: nvv4l2h264enc: V4L2 H.264 Encoder
nvvideo4linux2: nvv4l2decoder: NVIDIA v4l2 video decoder
$ gst-inspect-1.0 nvdsgst_infer
Plugin Details:
Name nvdsgst_infer
Description NVIDIA DeepStreamSDK TensorRT plugin
Filename /usr/lib/x86_64-linux-gnu/gstreamer-1.0/deepstream/libnvdsgst_infer.so
Version 5.1.0
License Proprietary
Source module nvinfer
Binary package NVIDIA DeepStreamSDK TensorRT plugin
Origin URL http://nvidia.com/
nvinfer: NvInfer plugin
1 features:
+-- 1 elements
$ tree /opt/nvidia/deepstream/deepstream-5.1 -L 2
/opt/nvidia/deepstream/deepstream-5.1
├── bin # test binaries
│ ├── deepstream-app
│ ├── deepstream-appsrc-test
│ ├── deepstream-audio
│ ├── deepstream-dewarper-app
│ ├── deepstream-gst-metadata-app
│ ├── deepstream-image-decode-app
│ ├── deepstream-image-meta-test
│ ├── deepstream-infer-tensor-meta-app
│ ├── deepstream-mrcnn-app
│ ├── deepstream-nvdsanalytics-test
│ ├── deepstream-nvof-app
│ ├── deepstream-opencv-test
│ ├── deepstream-perf-demo
│ ├── deepstream-segmentation-app
│ ├── deepstream-test1-app
│ ├── deepstream-test2-app
│ ├── deepstream-test3-app
│ ├── deepstream-test4-app
│ ├── deepstream-test5-app
│ ├── deepstream-testsr-app
│ ├── deepstream-transfer-learning-app
│ └── deepstream-user-metadata-app
├── install.sh
├── lib # C++ plugin shared libraries
│ ├── gst-plugins
│ │ ├── libgstnvvideo4linux2.so
│ │ ├── libgstnvvideoconvert.so
│ │ ├── libnvdsgst_audiotemplate.so
│ │ ├── libnvdsgst_dewarper.so
│ │ ├── libnvdsgst_dsanalytics.so
│ │ ├── libnvdsgst_dsexample.so
│ │ ├── libnvdsgst_eglglessink.so
│ │ ├── libnvdsgst_inferaudio.so
│ │ ├── libnvdsgst_inferserver.so
│ │ ├── libnvdsgst_infer.so
│ │ ├── libnvdsgst_jpegdec.so
│ │ ├── libnvdsgst_msgbroker.so
│ │ ├── libnvdsgst_msgconv.so
│ │ ├── libnvdsgst_multistream_2.a
│ │ ├── libnvdsgst_multistream.so
│ │ ├── libnvdsgst_multistreamtiler.so
│ │ ├── libnvdsgst_of.so
│ │ ├── libnvdsgst_ofvisual.so
│ │ ├── libnvdsgst_osd.so
│ │ ├── libnvdsgst_segvisual.so
│ │ ├── libnvdsgst_tracker.so
│ │ └── libnvdsgst_videotemplate.so
│ ├── libcuvidv4l2.so
│ ├── libiothub_client.so
│ ├── libiothub_client.so.1 -> /opt/nvidia/deepstream/deepstream-5.1/lib/libiothub_client.so
│ ├── libnvbuf_fdmap.so
│ ├── libnvbufsurface.so
│ ├── libnvbufsurftransform.so
│ ├── libnvds_amqp_proto.so
│ ├── libnvds_audiotransform.so
│ ├── libnvds_azure_edge_proto.so
│ ├── libnvds_azure_proto.so
│ ├── libnvds_batch_jpegenc.so
│ ├── libnvdsbufferpool.so
│ ├── libnvds_csvparser.so
│ ├── libnvds_dewarper.so
│ ├── libnvds_dsanalytics.so
│ ├── libnvdsgst_audio.so
│ ├── libnvdsgst_bufferpool.so
│ ├── libnvdsgst_helper.so
│ ├── libnvdsgst_inferbase.so
│ ├── libnvdsgst_meta.so
│ ├── libnvdsgst_smartrecord.so
│ ├── libnvdsgst_tensor.so
│ ├── libnvdsinfer_custom_impl_fasterRCNN.so
│ ├── libnvdsinfer_custom_impl_ssd.so
│ ├── libnvdsinfer_custom_impl_Yolo.so
│ ├── libnvds_infer_custom_parser_audio.so
│ ├── libnvds_infercustomparser.so
│ ├── libnvds_infer_server.so
│ ├── libnvds_infer.so
│ ├── libnvds_inferutils.so
│ ├── libnvds_kafka_proto.so
│ ├── libnvds_lljpegdec.so
│ ├── libnvds_logger.so
│ ├── libnvds_meta.so
│ ├── libnvds_mot_iou.so
│ ├── libnvds_mot_klt.so
│ ├── libnvds_msgbroker.so
│ ├── libnvds_msgconv_audio.so
│ ├── libnvds_msgconv.so
│ ├── libnvds_nvdcf.so
│ ├── libnvds_nvtxhelper.so
│ ├── libnvds_opticalflow_dgpu.so
│ ├── libnvds_osd.so
│ ├── libnvds_redis_proto.so
│ ├── libnvds_tracker.so
│ ├── libnvds_utils.so
│ ├── libnvv4l2.so
│ ├── libnvv4lconvert.so
│ ├── libnvvpi.so.1 -> /opt/nvidia/deepstream/deepstream-5.1/lib/libnvvpi.so.1.0.12
│ ├── libnvvpi.so.1.0.12
│ ├── libv4l
│ ├── pkg-config
│ ├── pyds.so
│ └── setup.py
├── LicenseAgreement.pdf
├── LICENSE.txt
├── README
├── README.rhel
├── samples # sample applications
│ ├── configs # sample configuration files
│ │ ├── deepstream-app
│ │ │ ├── config_infer_primary_endv.txt
│ │ │ ├── config_infer_primary_nano.txt # configures the nvinfer element as the primary detector, for Nano
│ │ │ ├── config_infer_primary.txt # configures the nvinfer element as the primary detector
│ │ │ ├── config_infer_secondary_carcolor.txt # configures the nvinfer element as a secondary classifier
│ │ │ ├── config_infer_secondary_carmake.txt
│ │ │ ├── config_infer_secondary_vehicletypes.txt
│ │ │ ├── config_mux_source30.txt
│ │ │ ├── config_mux_source4.txt
│ │ │ ├── iou_config.txt
│ │ │ ├── source1_usb_dec_infer_resnet_int8.txt # one USB camera as input
│ │ │ ├── source30_1080p_dec_infer-resnet_tiled_display_int8.txt # 30 x 1080p inputs: decode, inference, tiled display
│ │ │ ├── source4_1080p_dec_infer-resnet_tracker_sgie_tiled_display_int8_gpu1.txt # 4 x 1080p inputs on GPU 1: decode, inference, tracking, tiled display
│ │ │ ├── source4_1080p_dec_infer-resnet_tracker_sgie_tiled_display_int8.txt # 4 x 1080p inputs: decode, inference, tracking, tiled display
│ │ │ └── tracker_config.yml
│ │ ├── deepstream-app-trtis
│ │ ├── tlt_pretrained_models # config files for the TLT pretrained models; the corresponding models must still be downloaded into samples/models/tlt_pretrained_models
│ │ ├── config_infer_primary_dashcamnet.txt
│ │ ├── config_infer_primary_detectnet_v2.txt
│ │ ├── config_infer_primary_dssd.txt
│ │ ├── config_infer_primary_facedetectir.txt
│ │ ├── config_infer_primary_frcnn.txt
│ │ ├── config_infer_primary_mrcnn.txt
│ │ ├── config_infer_primary_peoplenet.txt
│ │ ├── config_infer_primary_retinanet.txt
│ │ ├── config_infer_primary_ssd.txt
│ │ ├── config_infer_primary_trafficcamnet.txt
│ │ ├── config_infer_primary_yolov3.txt
│ │ ├── config_infer_secondary_vehiclemakenet.txt
│ │ ├── config_infer_secondary_vehicletypenet.txt
│ │ ├── deepstream_app_source1_dashcamnet_vehiclemakenet_vehicletypenet.txt
│ │ ├── deepstream_app_source1_detection_models.txt
│ │ ├── deepstream_app_source1_facedetectir.txt
│ │ ├── deepstream_app_source1_mrcnn.txt
│ │ ├── deepstream_app_source1_peoplenet.txt
│ │ ├── deepstream_app_source1_trafficcamnet.txt
│ │ ├── detectnet_v2_labels.txt
│ │ ├── dssd_labels.txt
│ │ ├── frcnn_labels.txt
│ │ ├── labels_dashcamnet.txt
│ │ ├── labels_facedetectir.txt
│ │ ├── labels_peoplenet.txt
│ │ ├── labels_trafficnet.txt
│ │ ├── labels_vehiclemakenet.txt
│ │ ├── labels_vehicletypenet.txt
│ │ ├── mrcnn_labels.txt
│ │ ├── README
│ │ ├── retinanet_labels.txt
│ │ ├── ssd_labels.txt
│ │ └── yolov3_labels.txt
│ ├── models # sample models
│ │ ├── Primary_Detector # primary detector
│ │ │ ├── cal_trt.bin
│ │ │ ├── labels.txt
│ │ │ ├── resnet10.caffemodel
│ │ │ └── resnet10.prototxt
│ │ ├── Primary_Detector_Nano # primary detector, for Nano
│ │ │ ├── labels.txt
│ │ │ ├── resnet10.caffemodel
│ │ │ └── resnet10.prototxt
│ │ ├── Secondary_CarColor # secondary classifier: car color
│ │ │ ├── cal_trt.bin
│ │ │ ├── labels.txt
│ │ │ ├── mean.ppm
│ │ │ ├── resnet18.caffemodel
│ │ │ └── resnet18.prototxt
│ │ ├── Secondary_CarMake
│ │ │ ├── cal_trt.bin
│ │ │ ├── labels.txt
│ │ │ ├── mean.ppm
│ │ │ ├── resnet18.caffemodel
│ │ │ └── resnet18.prototxt
│ │ ├── Secondary_VehicleTypes # secondary classifier: vehicle type
│ │ │ ├── cal_trt.bin
│ │ │ ├── labels.txt
│ │ │ ├── mean.ppm
│ │ │ ├── resnet18.caffemodel
│ │ │ └── resnet18.prototxt
│ │ ├── Segmentation # segmentation models
│ │ │ ├── industrial
│ │ │ └── semantic
│ │ └── SONYC_Audio_Classifier
│ │ ├── audio_labels_car.txt
│ │ ├── audio_labels.txt
│ │ ├── sonyc_audio_classifier.onnx
│ │ ├── sonyc_audio_classify_car.onnx
│ │ └── sonyc_audio_classify.onnx
│ ├── prepare_classification_test_video.sh
│ ├── prepare_ds_trtis_model_repo.sh
│ ├── streams # sample media streams
│ │ ├── sample_1080p_h264.mp4
│ │ ├── sample_1080p_h265.mp4
│ │ ├── sample_720p.h264
│ │ ├── sample_720p.jpg
│ │ ├── sample_720p.mjpeg
│ │ ├── sample_720p.mp4
│ │ ├── sample_cam5.mp4
│ │ ├── sample_cam6.mp4
│ │ ├── sample_industrial.jpg
│ │ ├── sample_qHD.h264
│ │ ├── sample_qHD.mp4
│ │ ├── sonyc_mixed_audio.wav
│ │ ├── yoga.jpg
│ │ └── yoga.mp4
│ └── trtis_model_repo
├── sources
│ ├── apps # deepstream-app test code
│ │ ├── apps-common
│ │ │ ├── includes
│ │ │ └── src
│ │ └── sample_apps # sample applications
│ │ ├── deepstream-app
│ │ ├── deepstream-appsrc-test
│ │ ├── deepstream-audio
│ │ ├── deepstream-dewarper-test
│ │ ├── deepstream-gst-metadata-test
│ │ ├── deepstream-image-decode-test
│ │ ├── deepstream-image-meta-test
│ │ ├── deepstream-infer-tensor-meta-test
│ │ ├── deepstream-mrcnn-app
│ │ ├── deepstream-nvdsanalytics-test
│ │ ├── deepstream-nvof-test
│ │ ├── deepstream-opencv-test
│ │ ├── deepstream-perf-demo
│ │ ├── deepstream-segmentation-test
│ │ ├── deepstream-test1
│ │ ├── deepstream-test2
│ │ ├── deepstream-test3
│ │ ├── deepstream-test4
│ │ ├── deepstream-test5
│ │ ├── deepstream-testsr
│ │ ├── deepstream-transfer-learning-app
│ │ └── deepstream-user-metadata-test
│ ├── gst-plugins # NVIDIA GStreamer plugin sources
│ │ ├── gst-dsexample
│ │ ├── gst-nvdsaudiotemplate
│ │ ├── gst-nvdsosd
│ │ ├── gst-nvdsvideotemplate
│ │ ├── gst-nvinfer # TensorRT inference plugin
│ │ ├── gst-nvmsgbroker
│ │ └── gst-nvmsgconv
│ ├── includes
│ ├── libs # libraries used by gst-plugins
│ │ ├── amqp_protocol_adaptor
│ │ ├── azure_protocol_adaptor
│ │ ├── kafka_protocol_adaptor
│ │ ├── nvdsinfer # NvDsInfer library
│ │ ├── nvdsinfer_customparser # template bounding box parsing functions for custom models
│ │ ├── nvmsgbroker
│ │ ├── nvmsgconv
│ │ ├── nvmsgconv_audio
│ │ └── redis_protocol_adaptor
│ ├── objectDetector_FasterRCNN # Faster RCNN detector
│ ├── objectDetector_SSD # UFF SSD detector
│ ├── objectDetector_Yolo # Yolo detector
│ ├── SONYCAudioClassifier
│ └── tools
├── uninstall.sh
└── version
7. Install tao-converter
tao-converter converts TLT pretrained model files (.etlt) into TensorRT engine files (.engine), allowing the model to be optimized for speed, precision and stability.
Download the matching version of tao-converter
# Unpack the archive
cd ~/Downloads
unzip cuda11.1-trt7.2-20210820T231205Z-001.zip
cd cuda11.1-trt7.2
chmod 777 tao-converter
sudo cp ~/Downloads/cuda11.1-trt7.2/tao-converter /usr/local/bin/
# Set the environment variables (required for both root and normal users)
sudo vim ~/.bashrc
# Append the following at the end
export TRT_LIB_PATH="/usr/lib/x86_64-linux-gnu"
export TRT_INC_PATH="/usr/include/x86_64-linux-gnu"
source ~/.bashrc
tao-converter -h
III. Sample Apps Source
1. C/C++ Sample Apps Source
The sources directory is located at /opt/nvidia/deepstream/deepstream-5.1/sources
DeepStream C/C++ API
sudo apt-get install libgstreamer-plugins-base1.0-dev libgstreamer1.0-dev \
libgstrtspserver-1.0-dev libx11-dev libjson-glib-dev
1) deepstream-app
Because building detection pipelines through DeepStream's low-level C and Python APIs is still somewhat tedious, NVIDIA distilled the most common deep-learning processing flows into the reference application deepstream-app. It lets users describe the detection pipeline in a configuration file; deepstream-app then instantiates the corresponding DeepStream plugins according to that description and assembles the pipeline. So although deepstream-app is a reference application, it is often used as DeepStream's CLI tool.
How deepstream-app works
The front end uses decode plugins to read video streams (RTSP, files, USB cameras, etc.). Multiple camera streams are merged by the muxer into a batch and fed into the primary detector (object detection) to obtain bounding boxes. These are then passed to the tracker, and each tracked bounding box goes on to secondary inference (usually classifiers). The results are sent to the tiler, which composes a tiled 2D frame, and the osd plugin then uses the generated metadata to draw shaded boxes, rectangles and text on the composited frame. Finally the results are output (sink).
The output (sink) options are as follows:
Fakesink
EGL based windowed sink (nveglglessink)
Encode + File Save (encoder + muxer + filesink)
Encode + RTSP streaming
Overlay (Jetson only)
Message converter + Message broker
$ deepstream-app --help-all
Usage:
deepstream-app [OPTION…] Nvidia DeepStream Demo
Help Options:
-h, --help Show help options
--help-all Show all help options
--help-gst Show GStreamer Options
GStreamer Options
--gst-version Print the GStreamer version
--gst-fatal-warnings Make all warnings fatal
--gst-debug-help Print available debug categories and exit
--gst-debug-level=LEVEL Default debug level from 1 (only error) to 9 (anything) or 0 for no output
--gst-debug=LIST Comma-separated list of category_name:level pairs to set specific levels for the individual categories. Example: GST_AUTOPLUG:5,GST_ELEMENT_*:3
--gst-debug-no-color Disable colored debugging output
--gst-debug-color-mode Changes coloring mode of the debug log. Possible modes: off, on, disable, auto, unix
--gst-debug-disable Disable debugging
--gst-plugin-spew Enable verbose plugin loading diagnostics
--gst-plugin-path=PATHS Colon-separated paths containing plugins
--gst-plugin-load=PLUGINS Comma-separated list of plugins to preload in addition to the list stored in environment variable GST_PLUGIN_PATH
--gst-disable-segtrap Disable trapping of segmentation faults during plugin loading
--gst-disable-registry-update Disable updating the registry
--gst-disable-registry-fork Disable spawning a helper process while scanning the registry
Application Options:
-v, --version Print DeepStreamSDK version
-t, --tiledtext Display Bounding box labels in tiled mode
--version-all Print DeepStreamSDK and dependencies version
-c, --cfg-file Set the config file
-i, --input-file Set the input file
The deepstream-app configuration file uses the key-file format, based on the freedesktop specification.
The deepstream-app configuration file supports a number of optional configuration groups; a minimal sketch follows.
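# A minimal sketch of a deepstream-app config in key-file format. The group
# and key names follow the DeepStream 5.1 reference configs; the values here
# are illustrative, not a tuned configuration.
[application]
enable-perf-measurement=1
perf-measurement-interval-sec=5

[source0]
enable=1
type=3   # 3 = MultiURI source
uri=file:///opt/nvidia/deepstream/deepstream-5.1/samples/streams/sample_720p.h264
num-sources=1

[streammux]
batch-size=1
width=1280
height=720

[primary-gie]
enable=1
config-file=config_infer_primary.txt

[osd]
enable=1

[sink0]
enable=1
type=2   # 2 = EGL based windowed sink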
2) Sample application descriptions
Sample source details: each entry below gives the path inside the sources directory, followed by a description.
apps/sample_apps/deepstream-test1
Sample of how to use DeepStream elements for a single H.264 stream: filesrc → decode → nvstreammux → nvinfer (primary detector) → nvdsosd → renderer.
apps/sample_apps/deepstream-test2
Sample of how to use DeepStream elements for a single H.264 stream: filesrc → decode → nvstreammux → nvinfer (primary detector) → nvtracker → nvinfer (secondary classifier) → nvdsosd → renderer.
apps/sample_apps/deepstream-test3
Builds on deepstream-test1 (simple test application 1) to demonstrate how to:
Use multiple sources in the pipeline
Use a uridecodebin to accept any type of input (e.g. RTSP/File), any GStreamer supported container format, and any codec
Configure Gst-nvstreammux to generate a batch of frames and infer on it for better resource utilization
Extract the stream metadata, which contains useful information about the frames in the batched buffer
apps/sample_apps/deepstream-test4
Builds on deepstream-test1 for a single H.264 stream: filesrc, decode, nvstreammux, nvinfer, nvdsosd, renderer, to demonstrate how to:
Use the Gst-nvmsgconv and Gst-nvmsgbroker plugins in the pipeline
Create NVDS_META_EVENT_MSG type metadata and attach it to the buffer
Use NVDS_META_EVENT_MSG for different types of objects, e.g. vehicle and person
Implement “copy” and “free” functions for use if metadata is extended through the extMsg field
apps/sample_apps/deepstream-test5
Builds on top of deepstream-app. Demonstrates:
Use of the Gst-nvmsgconv and Gst-nvmsgbroker plugins in the pipeline for multistream
How to configure the Gst-nvmsgbroker plugin from the config file as a sink plugin (for KAFKA, Azure, etc.)
How to handle the RTCP sender reports from RTSP servers or cameras and translate the Gst Buffer PTS to a UTC timestamp. For more details refer to the RTCP Sender Report callback function test5_rtcp_sender_report_callback() registration and usage in deepstream_test5_app_main.c. GStreamer callback registration with the rtpmanager element's “handle-sync” signal is documented in apps-common/src/deepstream_source_bin.c.
libs/amqp_protocol_adaptor
Application to test AMQP protocol.
libs/azure_protocol_adaptor
Test application to show Azure IoT device2edge messaging and device2cloud messaging using MQTT.
apps/sample_apps/deepstream-app
Source code for the DeepStream reference application.
sources/objectDetector_SSD
Configuration files and custom library implementation for the SSD detector model.
sources/objectDetector_FasterRCNN
Configuration files and custom library implementation for the FasterRCNN model.
sources/objectDetector_Yolo
Configuration files and custom library implementation for the Yolo models, currently Yolo v2, v2 tiny, v3, and v3 tiny.
apps/sample_apps/deepstream-dewarper-test
Demonstrates dewarper functionality for single or multiple 360-degree camera streams. Reads camera calibration parameters from a CSV file and renders aisle and spot surfaces on the display.
apps/sample_apps/deepstream-nvof-test
Demonstrates optical flow functionality for single or multiple streams. This example uses two GStreamer plugins (Gst-nvof and Gst-nvofvisual). The Gst-nvof element generates the MV (motion vector) data and attaches it as user metadata. The Gst-nvofvisual element visualizes the MV data using a predefined color wheel matrix.
apps/sample_apps/deepstream-user-metadata-test
Demonstrates how to add custom or user-specific metadata to any component of DeepStream. The test code attaches a 16-byte array filled with user data to the chosen component. The data is retrieved in another component.
apps/sample_apps/deepstream-image-decode-test
Builds on deepstream-test3 to demonstrate image decoding instead of video. This example uses a custom decode bin so the MJPEG codec can be used as input.
apps/sample_apps/deepstream-segmentation-test
Demonstrates segmentation of multi-stream video or images using a semantic or industrial neural network and rendering output to a display.
apps/sample_apps/deepstream-gst-metadata-test
Demonstrates how to set metadata before the Gst-nvstreammux plugin in the DeepStream pipeline, and how to access it after Gst-nvstreammux.
apps/sample_apps/deepstream-infer-tensor-meta-app
Demonstrates how to flow and access nvinfer tensor output as metadata.
apps/sample_apps/deepstream-perf-demo
Performs single channel cascaded inferencing and object tracking sequentially on all streams in a directory.
apps/sample_apps/deepstream-nvdsanalytics-test
Demonstrates batched analytics like ROI filtering, Line crossing, direction detection and overcrowding
apps/sample_apps/deepstream-opencv-test
Demonstrates the use of OpenCV in dsexample plugin
apps/sample_apps/deepstream-image-meta-test
Demonstrates how to attach encoded image as meta data and save the images in jpeg format.
apps/sample_apps/deepstream-appsrc-test
Demonstrates AppSrc and AppSink usage for consuming and giving data from non DeepStream code respectively.
apps/sample_apps/deepstream-transfer-learning-app
Demonstrates a mechanism to save images of objects detected with low confidence; these images can then be used for further training.
apps/sample_apps/deepstream-mrcnn-test
Demonstrates Instance segmentation using Mask-RCNN model
apps/sample_apps/deepstream-audio
Source code for the DeepStream reference application demonstrating audio analytics pipeline.
apps/sample_apps/deepstream-testsr
Demonstrates event based smart record functionality
3) Running a sample test
$ cd /opt/nvidia/deepstream/deepstream-5.1/sources/apps/sample_apps/deepstream-test1
$ deepstream-test1-app /opt/nvidia/deepstream/deepstream-5.1/samples/streams/sample_720p.h264
2. Python Sample Apps Source
This section provides information about developing DeepStream applications in Python.
The Python bindings are included in the DeepStream 5.1 SDK, and the sample applications are available in the deepstream_python_apps repository on GitHub.
1) Python bindings
DeepStream supports application development in C/C++ and in Python through Python bindings. DeepStream pipelines can be constructed using Gst-Python, the GStreamer framework's Python bindings. To give access to DeepStream metadata, the Python bindings are provided as a compiled module that is included in the DeepStream SDK.
DeepStream Python API
DeepStream Python applications use the Gst-Python API to build the pipeline and use probe functions to access data at various points in the pipeline. The data types are all native C types and require a shim layer through the Python bindings or NumPy before a Python application can access them. Tensor data is the raw tensor output produced by inference; for object detection, this tensor data must be post-processed with parsing and clustering algorithms to create bounding boxes around the detected objects.
The Python bindings are installed at /opt/nvidia/deepstream/deepstream/lib/pyds.so
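As a minimal sketch of such a probe function (patterned after the official deepstream_python_apps samples, and assuming the probe is attached to a pad downstream of nvstreammux, e.g. the nvdsosd sink pad of a Gst-Python pipeline), counting detected objects per frame might look like this:
import gi
gi.require_version("Gst", "1.0")
from gi.repository import Gst
import pyds

def osd_sink_pad_buffer_probe(pad, info, u_data):
    gst_buffer = info.get_buffer()
    if not gst_buffer:
        return Gst.PadProbeReturn.OK
    # nvstreammux attaches NvDsBatchMeta to each Gst buffer
    batch_meta = pyds.gst_buffer_get_nvds_batch_meta(hash(gst_buffer))
    l_frame = batch_meta.frame_meta_list
    while l_frame is not None:
        try:
            frame_meta = pyds.NvDsFrameMeta.cast(l_frame.data)
        except StopIteration:
            break
        # Walk the object metadata list of this frame
        num_objects = 0
        l_obj = frame_meta.obj_meta_list
        while l_obj is not None:
            try:
                obj_meta = pyds.NvDsObjectMeta.cast(l_obj.data)  # one detected object
            except StopIteration:
                break
            num_objects += 1
            try:
                l_obj = l_obj.next
            except StopIteration:
                break
        print("Frame {}: {} objects".format(frame_meta.frame_num, num_objects))
        try:
            l_frame = l_frame.next
        except StopIteration:
            break
    return Gst.PadProbeReturn.OK
The probe would be registered on the pad with add_probe(Gst.PadProbeType.BUFFER, osd_sink_pad_buffer_probe, 0), as the deepstream-test1 Python sample does.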
2) Running the Sample Applications
$ sudo apt install python3-gi python3-dev python3-gst-1.0 -y
$ sudo apt install python3-opencv
$ sudo apt install python3-numpy
$ sudo apt install libgstrtspserver-1.0-0 gstreamer1.0-rtsp
$ sudo apt install libgirepository1.0-dev
$ sudo apt install gobject-introspection gir1.2-gst-rtsp-server-1.0
Download the sample source code
$ cd /opt/nvidia/deepstream/deepstream-5.1/sources # required: the models are referenced via relative paths
$ git clone https://github.com/NVIDIA-AI-IOT/deepstream_python_apps
Run a sample
https://blog.csdn.net/leida_wt/article/details/113368272
$ cd /opt/nvidia/deepstream/deepstream-5.1/sources/deepstream_python_apps/apps/deepstream-test1-rtsp-out
$ python3 deepstream_test1_rtsp_out.py -i ../../../../samples/streams/sample_720p.h264
Creating Pipeline
Creating Source
Creating H264Parser
Creating Decoder
Creating H264 Encoder
Creating H264 rtppay
Playing file ../../../../samples/streams/sample_720p.h264
Adding elements to Pipeline
Linking elements in the Pipeline
*** DeepStream: Launched RTSP Streaming at rtsp://localhost:8554/ds-test ***
Starting pipeline
0:00:01.610084345 4075 0x31e5c00 INFO nvinfer gstnvinfer.cpp:619:gst_nvinfer_logger:<primary-inference> NvDsInferContext[UID 1]: Info from NvDsInferContextImpl::deserializeEngineAndBackend() <nvdsinfer_context_impl.cpp:1702> [UID = 1]: deserialized trt engine from :/opt/nvidia/deepstream/deepstream-5.1/samples/models/Primary_Detector/resnet10.caffemodel_b1_gpu0_int8.engine
INFO: ../nvdsinfer/nvdsinfer_model_builder.cpp:685 [Implicit Engine Info]: layers num: 3
0 INPUT kFLOAT input_1 3x368x640
1 OUTPUT kFLOAT conv2d_bbox 16x23x40
2 OUTPUT kFLOAT conv2d_cov/Sigmoid 4x23x40
0:00:01.610191478 4075 0x31e5c00 INFO nvinfer gstnvinfer.cpp:619:gst_nvinfer_logger:<primary-inference> NvDsInferContext[UID 1]: Info from NvDsInferContextImpl::generateBackendContext() <nvdsinfer_context_impl.cpp:1806> [UID = 1]: Use deserialized engine model: /opt/nvidia/deepstream/deepstream-5.1/samples/models/Primary_Detector/resnet10.caffemodel_b1_gpu0_int8.engine
0:00:01.611249626 4075 0x31e5c00 INFO nvinfer gstnvinfer_impl.cpp:313:notifyLoadModelStatus:<primary-inference> [UID 1]: Load new model:dstest1_pgie_config.txt sucessfully
Frame Number=0 Number of Objects=6 Vehicle_count=4 Person_count=2
Frame Number=1 Number of Objects=6 Vehicle_count=4 Person_count=2
Frame Number=1441 Number of Objects=0 Vehicle_count=0 Person_count=0
End-of-stream
# While it is running, open this address with a player such as VLC
rtsp://<server IP>:8554/ds-test
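For example, with the ffmpeg tools installed, the stream can be checked with ffplay (substituting the server's IP address):
$ ffplay rtsp://<server IP>:8554/ds-test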
$ nvidia-smi
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 455.32.00 Driver Version: 455.32.00 CUDA Version: 11.1 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|===============================+======================+======================|
| 0 Tesla T4 On | 00000000:5E:00.0 Off | Off |
| N/A 38C P0 27W / 70W | 834MiB / 16127MiB | 10% Default |
| | | N/A |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=============================================================================|
| 0 N/A N/A 5603 C python3 703MiB |
+-----------------------------------------------------------------------------+
3) Sample application descriptions
Each entry below gives the path inside the GitHub repo, followed by a description.
apps/deepstream-test1
Simple example of how to use DeepStream elements for a single H.264 stream: filesrc → decode → nvstreammux → nvinfer (primary detector) → nvdsosd → renderer.
apps/deepstream-test2
Simple example of how to use DeepStream elements for a single H.264 stream: filesrc → decode → nvstreammux → nvinfer (primary detector) → nvtracker → nvinfer (secondary classifier) → nvdsosd → renderer.
apps/deepstream-test3
Builds on deepstream-test1 (simple test application 1) to demonstrate how to:
Use multiple sources in the pipeline
Use a uridecodebin to accept any type of input (e.g. RTSP/File), any GStreamer supported container format, and any codec
Configure Gst-nvstreammux to generate a batch of frames and infer on it for better resource utilization
Extract the stream metadata, which contains useful information about the frames in the batched buffer
apps/deepstream-test4
Builds on deepstream-test1 for a single H.264 stream: filesrc, decode, nvstreammux, nvinfer, nvdsosd, renderer, to demonstrate how to:
Use the Gst-nvmsgconv and Gst-nvmsgbroker plugins in the pipeline
Create NVDS_META_EVENT_MSG type metadata and attach it to the buffer
Use NVDS_META_EVENT_MSG for different types of objects, e.g. vehicle and person
Implement “copy” and “free” functions for use if metadata is extended through the extMsg field
apps/deepstream-test1-usbcam
Simple test application 1 modified to process a single stream from a USB camera.
apps/deepstream-test1-rtsp-out
Simple test application 1 modified to output visualization stream over RTSP.
apps/deepstream-imagedata-multistream
Builds on simple test application 3 to demonstrate how to:
Access decoded frames as NumPy arrays in the pipeline
Check detection confidence of detected objects (DBSCAN or NMS clustering required)
Modify frames and see the changes reflected downstream in the pipeline
Use OpenCV to annotate the frames and save them to file
apps/deepstream-ssd-parser
Demonstrates how to perform custom post-processing for inference output from Triton Inference Server:
Use SSD model on Triton Inference Server for object detection
Enable custom post-processing and raw tensor export for Triton Inference Server via configuration file settings
Access inference output tensors in the pipeline for post-processing in Python
Add detected objects to the metadata
Output the OSD visualization to MP4 file
apps/deepstream-opticalflow
Demonstrates how to obtain optical flow metadata and also demonstrates how to:
Access optical flow vectors as a NumPy array
Visualize optical flow using the obtained flow vectors and OpenCV
apps/deepstream-segmentation
Demonstrates how to obtain segmentation metadata and also demonstrates how to:
Access segmentation masks as a NumPy array
Visualize segmentation using the obtained masks and OpenCV
apps/deepstream-nvdsanalytics
Demonstrates how to use the nvdsanalytics plugin and obtain analytics metadata
3. TLT pre-trained models
https://docs.nvidia.com/tao/tao-toolkit/text/deepstream_tao_integration.html
Users can start from two types of pretrained models:
Purpose-built pre-trained models: highly accurate models trained on thousands of data inputs for a specific task. These domain-focused models can be used directly for inference, or with the TAO Toolkit for transfer learning on the user's own dataset.
General purpose vision models: the pretrained weights of these models serve only as a starting point for building more complex models. For computer-vision use cases, these pretrained weights are trained on the Open Images dataset and provide a much better starting point for training than a random initialization of the weights.
There are two options for deploying a model trained with the TAO Toolkit to DeepStream:
Option 1: Integrate the .etlt model directly into the DeepStream application. The model file is produced by export. DeepStream consumes the .etlt file and the calibration cache directly, automatically generates the TensorRT engine file, and then runs inference.
Option 2: Use tao-converter. The generated TensorRT engine file can also be ingested by DeepStream.
# Download the encrypted TLT model file
cd /opt/nvidia/deepstream/deepstream-5.1/samples/models/
mkdir -p tlt_pretrained_models/peoplenet
cd tlt_pretrained_models/peoplenet
wget https://api.ngc.nvidia.com/v2/models/nvidia/tao/peoplenet/versions/pruned_v2.1/files/resnet34_peoplenet_pruned.etlt
# Run the model
cd /opt/nvidia/deepstream/deepstream-5.1/samples/configs/tlt_pretrained_models/
sudo deepstream-app -c deepstream_app_source1_peoplenet.txt
Note: on the first run, since the model-engine-file configured for primary-gie does not exist yet, deepstream-app automatically converts the .etlt file into a TensorRT engine file (.engine).
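For reference, the TLT-specific keys in the primary-gie nvinfer config follow the pattern below (key names as used by the DeepStream 5.1 tlt_pretrained_models configs; the paths and engine file name are illustrative):
[property]
tlt-model-key=tlt_encode
tlt-encoded-model=../../models/tlt_pretrained_models/peoplenet/resnet34_peoplenet_pruned.etlt
model-engine-file=../../models/tlt_pretrained_models/peoplenet/resnet34_peoplenet_pruned.etlt_b1_gpu0_int8.engine
labelfile-path=labels_peoplenet.txt
If model-engine-file is missing, nvinfer builds the engine from the .etlt file as described above.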
Example 2: License Plate detection and recognition
License Plate Detection (LPDNet) and Recognition (LPRNet)
DeepStream 5.0 series: license plate recognition
cd /opt/nvidia/deepstream/deepstream-5.1/sources/
# Download the test application
git clone https://github.com/NVIDIA-AI-IOT/deepstream_lpr_app.git
cd deepstream_lpr_app/
# Download the Chinese license-plate models and convert them to TensorRT engine files
./download_ch.sh
# gst-nvinfer cannot generate TRT engine for LPR model, so generate it with tao-converter
tao-converter -k nvidia_tlt -p image_input,1x3x48x96,4x3x48x96,16x3x48x96 \
models/LP/LPR/ch_lprnet_baseline18_deployable.etlt -t fp16 -e models/LP/LPR/lpr_ch_onnx_b16.engine
# Build the application and plugin
cd deepstream-lpr-app
cp dict_ch.txt dict.txt
# Run the test application
sudo ./deepstream-lpr-app 2 1 0 ch_car_test.mp4 ch_car_test.mp4 output.264
2) General purpose vision models
https://docs.nvidia.com/tao/tao-toolkit/text/overview.html
The Open Model Architectures (TAO models) are trained from general purpose models.
sudo su
cd /opt/nvidia/deepstream/deepstream-5.1/sources
git clone -b release/tao3.0 https://github.com/NVIDIA-AI-IOT/deepstream_tao_apps.git
cd deepstream_tao_apps
# Build the sources
export CUDA_VER=11.1
make
# Download the models
./download_models.sh
tree apps/ -L 1
apps/
├── Makefile
├── tao_classifier
├── tao_detection
├── tao_others
└── tao_segmentation
tree models/ -L 1
models/
├── dssd
├── emotion
├── faciallandmark
├── frcnn
├── gazenet
├── heartrate
├── peopleSegNet
├── peopleSemSegNet
├── retinanet
├── ssd
├── unet
├── yolov3
└── yolov4
# The Unet/peopleSemSegNet/yolov3/yolov4 models are not supported for direct conversion by DeepStream; use tao-converter to convert the .etlt file into an .engine file
tao-converter -e models/unet/unet_resnet18.etlt_b1_gpu0_fp16.engine -p input_1,1x3x608x960,1x3x608x960,1x3x608x960 -t fp16 -k tlt_encode -m 1 models/unet/unet_resnet18.etlt
# Test the apps and the models
./apps/tao_segmentation/ds-tao-segmentation -h
Usage: ./apps/tao_segmentation/ds-tao-segmentation -c pgie_config_file -i <H264 or JPEG filename> [-b BATCH] [-d]
-h: print help info
-c: pgie config file, e.g. pgie_frcnn_tao_config.txt
-i: H264 or JPEG input file
-b: batch size, this will override the value of "batch-size" in pgie config file
-d: enable display, otherwise dump to output H264 or JPEG file
./apps/tao_segmentation/ds-tao-segmentation -c configs/unet_tao/pgie_unet_tao_config.txt -i ../../samples/streams/sample_720p.h264
./apps/tao_classifier/ds-tao-classifier -c configs/multi_task_tao/pgie_multi_task_tao_config.txt -i ../../samples/streams/sample_720p.h264
SHOW_MASK=1 ./apps/tao_detection/ds-tao-detection -c configs/frcnn_tao/pgie_frcnn_tao_config.txt -i ../../samples/streams/sample_720p.h264
Note: the model-conversion commands and invocation references can be obtained from the corresponding model pages on NGC.
IV. Deployment and Usage with Docker
List of dGPU DeepStream containers, from https://ngc.nvidia.com
Container
Container pull commands
base docker (contains only the runtime libraries and GStreamer plugins. Can be used as a base to build custom dockers for DeepStream applications)
docker pull nvcr.io/nvidia/deepstream:5.1-21.02-base
devel docker (contains the entire SDK along with a development environment for building DeepStream applications)
docker pull nvcr.io/nvidia/deepstream:5.1-21.02-devel
Triton Inference Server docker with Triton Inference Server and dependencies installed along with a development environment for building DeepStream applications
docker pull nvcr.io/nvidia/deepstream:5.1-21.02-triton
DeepStream IoT docker with deepstream-test5-app installed and all other reference applications removed
docker pull nvcr.io/nvidia/deepstream:5.1-21.02-iot
DeepStream samples docker (contains the runtime libraries, GStreamer plugins, reference applications and sample streams, models and configs)
docker pull nvcr.io/nvidia/deepstream:5.1-21.02-samples
Usage example: running the PeopleNet model in a container
mkdir -p $HOME/peoplenet && \
wget https://api.ngc.nvidia.com/v2/models/nvidia/tao/peoplenet/versions/pruned_v2.1/files/resnet34_peoplenet_pruned.etlt \
-O $HOME/peoplenet/resnet34_peoplenet_pruned.etlt
xhost +
docker run --gpus all -it --rm -v /tmp/.X11-unix:/tmp/.X11-unix -e DISPLAY=$DISPLAY -v $HOME:/opt/nvidia/deepstream/deepstream-5.1/samples/models/tlt_pretrained_models \
-w /opt/nvidia/deepstream/deepstream-5.1/samples/configs/tlt_pretrained_models nvcr.io/nvidia/deepstream:5.1-21.02-samples \
deepstream-app -c deepstream_app_source1_peoplenet.txt
# xhost grants access to the host's display
# --gpus all: make all GPUs visible to the container
# --device=/dev/video0: map camera 1 into the container
# -it -p 8554:8554: map the RTSPStreaming RTSP port (optional)
# -p 5400:5400/udp: map the RTSPStreaming UDP port (optional)
# -v /tmp/.X11-unix:/tmp/.X11-unix -e DISPLAY=:0: connect the GUI to the host display
# -v $HOME:/opt/nvidia/deepstream/deepstream-5.1/samples/models/tlt_pretrained_models: mount a host folder into the container
# -w /opt/nvidia/deepstream/deepstream-5.1/samples/configs/tlt_pretrained_models: set the working directory used on entering the container
V. Key Plugins
1. Gst-nvinfer
1) Inputs and Outputs
Inputs:
UFF file
TLT Encoded Model and Key
Offline: Supports engine files generated by the Transfer Learning Toolkit SDK model converters
Layers: Supports all layers supported by TensorRT; see https://docs.nvidia.com/deeplearning/sdk/tensorrt-developer-guide/index.html
Control parameters
Gst-nvinfer gets control parameters from a configuration file. You can specify this by setting the property config-file-path. For details, see Gst-nvinfer File Configuration Specifications. Other control parameters that can be set through GObject properties are:
Batch size
Inference interval
Attach inference tensor outputs as buffer metadata
Attach instance mask output as in object metadata
The parameters set through the GObject properties override the parameters in the Gst-nvinfer configuration file.
Outputs
Gst Buffer
Depending on network type and configured parameters, one or more of:
NvDsObjectMeta
NvDsClassifierMeta
NvDsInferSegmentationMeta
NvDsInferTensorMeta
2)Gst-nvinfer File Configuration Specifications
The Gst-nvinfer configuration file uses a “Key File” format.
https://docs.nvidia.com/metropolis/deepstream/dev-guide/text/DS_plugin_gst-nvinfer.html#inputs-and-outputs
The configuration parameters that you must specify include the following (a minimal example follows the list):
model-file (Caffe model)
proto-file (Caffe model)
uff-file (UFF models)
onnx-file (ONNX models)
model-engine-file, if already generated
int8-calib-file for INT8 mode
mean-file, if required
offsets, if required
maintain-aspect-ratio, if required
parse-bbox-func-name (detectors only) // name of the custom bounding box parsing function
parse-classifier-func-name (classifiers only) // name of the custom classifier output parsing function
custom-lib-path // shared library containing the custom parsing implementation
output-blob-names (Caffe and UFF models)
network-type
model-color-format
process-mode
engine-create-func-name
infer-dims (UFF models)
uff-input-order (UFF models)
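A minimal sketch of such a key-file config for the sample Caffe-based ResNet-10 primary detector (key names from the specification above; the values follow the shipped config_infer_primary.txt but are illustrative):
[property]
gpu-id=0
model-file=../../models/Primary_Detector/resnet10.caffemodel
proto-file=../../models/Primary_Detector/resnet10.prototxt
model-engine-file=../../models/Primary_Detector/resnet10.caffemodel_b1_gpu0_int8.engine
int8-calib-file=../../models/Primary_Detector/cal_trt.bin
labelfile-path=../../models/Primary_Detector/labels.txt
batch-size=1
network-mode=1   # 0=FP32, 1=INT8, 2=FP16
num-detected-classes=4
process-mode=1   # 1=primary (full frame), 2=secondary (on objects)
network-type=0   # 0=detector
output-blob-names=conv2d_bbox;conv2d_cov/Sigmoid

[class-attrs-all]
pre-cluster-threshold=0.2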
2. Gst-nvmsgconv and Gst-nvmsgbroker
1) Gst-nvmsgconv
The Gst-nvmsgconv plugin parses NVDS_EVENT_MSG_META (NvDsEventMsgMeta) type metadata attached to the buffer as user metadata of the frame meta and generates the schema payload (full or minimal).
Full DeepStream schema: generates the payload in JSON format; it supports elaborate semantics for object detection, analytics modules, events, location, and sensor.
2) Gst-nvmsgbroker
This plugin sends payload messages to the server using a specified communication protocol.
It accepts any buffer that has NvDsPayload metadata attached and uses the nvds_msgapi_* interface to send the messages to the server. You must implement the nvds_msgapi_* interface for the protocol to be used and specify the implementing library in the proto-lib property.
git clone -b v0.8.0 --recursive https://github.com/alanxz/rabbitmq-c.git
cd rabbitmq-c/
mkdir build && cd build
cmake ..
cmake --build .
sudo cp ./librabbitmq/librabbitmq.so.4 /usr/lib/
sudo apt-get install libglib2.0 libglib2.0-dev
Run the AMQP broker (RabbitMQ)
Run it as a container; once the container has started, open http://localhost:15672 in a browser to view the management console.
The default RabbitMQ username is guest and the password is guest.
$ sudo docker pull rabbitmq:management
$ sudo docker run --name rabbitmq -d -p 15672:15672 -p 5672:5672 rabbitmq:management
# -d run the container in the background;
# --name set the container name;
# -p map the service ports (5672: application access port; 15672: management web console port);
Run the sample application deepstream-test4
$ cd /opt/nvidia/deepstream/deepstream-5.1/sources/apps/sample_apps/deepstream-test4
$ deepstream-test4-app -h
Usage:
deepstream-test4-app [OPTION…] Nvidia DeepStream Test4
Help Options:
-h, --help Show help options
--help-all Show all help options
--help-gst Show GStreamer Options
Application Options:
-c, --cfg-file Set the adaptor config file. Optional if connection string has relevant details.
-i, --input-file Set the input H264 file
-t, --topic Name of message topic. Optional if it is part of connection string or config file.
--conn-str Connection string of backend server. Optional if it is part of config file.
-p, --proto-lib Absolute path of adaptor library
-s, --schema Type of message schema (0=Full, 1=minimal), default=0
--no-display Disable display
$ deepstream-test4-app -i /opt/nvidia/deepstream/deepstream-5.1/samples/streams/sample_720p.h264 -p /opt/nvidia/deepstream/deepstream/lib/libnvds_amqp_proto.so -c cfg_amqp.txt -t dstopic -s 1 --no-display
# Set the sample topic name to dstopic
# -s selects the message schema (0=full, 1=minimal)
Configuration file cfg_amqp.txt
The sample application acts as the producer, publishing to the topic:
[message-broker]
password = guest
hostname = localhost
username = guest
port = 5672
exchange = amq.topic
topic = topicname
#share-connection = 1
Create a queue "dsqueue" in RabbitMQ and bind it to the exchange "amq.topic"; a Go consumer can then receive the messages, for example:
package main

import (
	"log"

	"github.com/streadway/amqp"
)

func failOnError(err error, msg string) {
	if err != nil {
		log.Fatalf("%s: %s", msg, err)
	}
}

func main() {
	// Connect to the RabbitMQ server
	conn, err := amqp.Dial("amqp://guest:guest@10.53.3.189:5672/")
	failOnError(err, "Failed to connect to RabbitMQ")
	defer conn.Close()

	// Create a channel
	ch, err := conn.Channel()
	failOnError(err, "Failed to open a channel")
	defer ch.Close()

	// Receive (consume) messages from the queue
	msgs, err := ch.Consume(
		"dsqueue", // queue
		"",        // consumer
		true,      // auto-ack
		false,     // exclusive
		false,     // no-local
		false,     // no-wait
		nil,       // args
	)
	failOnError(err, "Failed to register a consumer")

	forever := make(chan bool)
	go func() {
		// Print the content of each queue message
		for d := range msgs {
			log.Printf("Received a message: %s", d.Body)
		}
	}()
	log.Printf(" [*] Waiting for messages. To exit press CTRL+C")
	<-forever
}
$ ./rabbitmq-consumer
# full schema
2021/10/22 16:47:34 Received a message: {
  "messageid" : "a6b0fe29-bc4c-4c1f-b10c-9425eaeaf452",
  "mdsversion" : "1.0",
  "@timestamp" : "2021-10-22T08:45:59.281Z",
  "place" : {
    "id" : "1",
    "name" : "XYZ",
    "type" : "garage",
    "location" : {
      "lat" : 30.32,
      "lon" : -40.549999999999997,
      "alt" : 100.0
    },
    "aisle" : {
      "id" : "walsh",
      "name" : "lane1",
      "level" : "P2",
      "coordinate" : {
        "x" : 1.0,
        "y" : 2.0,
        "z" : 3.0
      }
    }
  },
  "sensor" : {
    "id" : "CAMERA_ID",
    "type" : "Camera",
    "description" : "\"Entrance of Garage Right Lane\"",
    "location" : {
      "lat" : 45.293701446999997,
      "lon" : -75.830391449900006,
      "alt" : 48.155747933800001
    },
    "coordinate" : {
      "x" : 5.2000000000000002,
      "y" : 10.1,
      "z" : 11.199999999999999
    }
  },
  "analyticsModule" : {
    "id" : "XYZ",
    "description" : "\"Vehicle Detection and License Plate Recognition\"",
    "source" : "OpenALR",
    "version" : "1.0"
  },
  "object" : {
    "id" : "-1",
    "speed" : 0.0,
    "direction" : 0.0,
    "orientation" : 0.0,
    "vehicle" : {
      "type" : "sedan",
      "make" : "Bugatti",
      "model" : "M",
      "color" : "blue",
      "licenseState" : "CA",
      "license" : "XX1234",
      "confidence" : -0.10000000149011612
    },
    "bbox" : {
      "topleftx" : 1173,
      "toplefty" : 481,
      "bottomrightx" : 1227,
      "bottomrighty" : 504
    },
    "location" : {
      "lat" : 0.0,
      "lon" : 0.0,
      "alt" : 0.0
    },
    "coordinate" : {
      "x" : 0.0,
      "y" : 0.0,
      "z" : 0.0
    }
  },
  "event" : {
    "id" : "72931242-c039-4f90-b786-bdb296863c23",
    "type" : "moving"
  },
  "videoPath" : ""
}
# minimal schema
2021/10/22 16:56:13 Received a message: {
  "version" : "4.0",
  "id" : 180,
  "@timestamp" : "2021-10-22T08:56:13.744Z",
  "sensorId" : "sensor-0",
  "objects" : [
    "-1|1176|475.435|1260|516.522|Vehicle|#|sedan|Bugatti|M|blue|XX1234|CA|-0.1"
  ]
}
3. Using a Custom AI Model
https://docs.nvidia.com/metropolis/deepstream/dev-guide/text/DS_custom_YOLO.html?highlight=yolov3
A custom AI model must implement a custom bounding box parser function; otherwise running the model fails with errors like the following:
0:00:04.191794733 7711 0x5611f74a9b20 ERROR nvinfer gstnvinfer.cpp:613:gst_nvinfer_logger:<primary_gie> NvDsInferContext[UID 1]: Error in NvDsInferContextImpl::parseBoundingBox() <nvdsinfer_context_impl_output_parsing.cpp:59> [UID = 1]: Could not find output coverage layer for parsing objects
0:00:04.191854933 7711 0x5611f74a9b20 ERROR nvinfer gstnvinfer.cpp:613:gst_nvinfer_logger:<primary_gie> NvDsInferContext[UID 1]: Error in NvDsInferContextImpl::fillDetectionOutput() <nvdsinfer_context_impl_output_parsing.cpp:733> [UID = 1]: Failed to parse bboxes
Segmentation fault (core dumped)
DeepStream currently has the following output parsers built in; any other model needs a custom implementation:
FasterRCNN
MaskRCNN
YoloV3 / YoloV3Tiny / YoloV2 / YoloV2Tiny
DetectNet
1) The nvdsinfer_customparser sample
DeepStream supports NVIDIA® TensorRT™ plugins for custom layers.
Location: /opt/nvidia/deepstream/deepstream-5.1/sources/libs/nvdsinfer_customparser.
ls nvdsinfer_customparser
Makefile nvdsinfer_custombboxparser.cpp nvdsinfer_customclassifierparser.cpp README
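To build libnvds_infercustomparser.so from this directory, a sketch assuming the Makefile follows the usual SDK convention of requiring CUDA_VER to be set:
$ export CUDA_VER=11.1
$ make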
nvdsinfer_custombboxparser.cpp
#include <cstring>
#include <iostream>
#include "nvdsinfer_custom_impl.h"
#include <cassert>
#define MIN(a,b) ((a) < (b) ? (a) : (b))
#define MAX(a,b) ((a) > (b) ? (a) : (b))
#define CLIP(a,min,max) (MAX(MIN(a, max), min))
#define DIVIDE_AND_ROUND_UP(a, b) ((a + b - 1) / b)
struct MrcnnRawDetection {
    float y1, x1, y2, x2, class_id, score;
};

/* This is a sample bounding box parsing function for the sample Resnet10
 * detector model provided with the SDK. */

/* C-linkage to prevent name-mangling */
extern "C"
bool NvDsInferParseCustomResnet (std::vector<NvDsInferLayerInfo> const &outputLayersInfo,
        NvDsInferNetworkInfo const &networkInfo,
        NvDsInferParseDetectionParams const &detectionParams,
        std::vector<NvDsInferObjectDetectionInfo> &objectList);

extern "C"
bool NvDsInferParseCustomResnet (std::vector<NvDsInferLayerInfo> const &outputLayersInfo,
        NvDsInferNetworkInfo const &networkInfo,
        NvDsInferParseDetectionParams const &detectionParams,
        std::vector<NvDsInferObjectDetectionInfo> &objectList)
{
    static NvDsInferDimsCHW covLayerDims;
    static NvDsInferDimsCHW bboxLayerDims;
    static int bboxLayerIndex = -1;
    static int covLayerIndex = -1;
    static bool classMismatchWarn = false;
    int numClassesToParse;

    /* Find the bbox layer */
    if (bboxLayerIndex == -1) {
        for (unsigned int i = 0; i < outputLayersInfo.size(); i++) {
            if (strcmp(outputLayersInfo[i].layerName, "conv2d_bbox") == 0) {
                bboxLayerIndex = i;
                getDimsCHWFromDims(bboxLayerDims, outputLayersInfo[i].inferDims);
                break;
            }
        }
        if (bboxLayerIndex == -1) {
            std::cerr << "Could not find bbox layer buffer while parsing" << std::endl;
            return false;
        }
    }

    /* Find the cov layer */
    if (covLayerIndex == -1) {
        for (unsigned int i = 0; i < outputLayersInfo.size(); i++) {
            if (strcmp(outputLayersInfo[i].layerName, "conv2d_cov/Sigmoid") == 0) {
                covLayerIndex = i;
                getDimsCHWFromDims(covLayerDims, outputLayersInfo[i].inferDims);
                break;
            }
        }
        if (covLayerIndex == -1) {
            std::cerr << "Could not find bbox layer buffer while parsing" << std::endl;
            return false;
        }
    }

    /* Warn in case of mismatch in number of classes */
    if (!classMismatchWarn) {
        if (covLayerDims.c != detectionParams.numClassesConfigured) {
            std::cerr << "WARNING: Num classes mismatch. Configured:" <<
                detectionParams.numClassesConfigured << ", detected by network: " <<
                covLayerDims.c << std::endl;
        }
        classMismatchWarn = true;
    }

    /* Calculate the number of classes to parse */
    numClassesToParse = MIN (covLayerDims.c, detectionParams.numClassesConfigured);

    int gridW = covLayerDims.w;
    int gridH = covLayerDims.h;
    int gridSize = gridW * gridH;
    float gcCentersX[gridW];
    float gcCentersY[gridH];
    float bboxNormX = 35.0;
    float bboxNormY = 35.0;
    float *outputCovBuf = (float *) outputLayersInfo[covLayerIndex].buffer;
    float *outputBboxBuf = (float *) outputLayersInfo[bboxLayerIndex].buffer;
    int strideX = DIVIDE_AND_ROUND_UP(networkInfo.width, bboxLayerDims.w);
    int strideY = DIVIDE_AND_ROUND_UP(networkInfo.height, bboxLayerDims.h);

    for (int i = 0; i < gridW; i++) {
        gcCentersX[i] = (float)(i * strideX + 0.5);
        gcCentersX[i] /= (float)bboxNormX;
    }
    for (int i = 0; i < gridH; i++) {
        gcCentersY[i] = (float)(i * strideY + 0.5);
        gcCentersY[i] /= (float)bboxNormY;
    }

    for (int c = 0; c < numClassesToParse; c++) {
        float *outputX1 = outputBboxBuf + (c * 4 * bboxLayerDims.h * bboxLayerDims.w);
        float *outputY1 = outputX1 + gridSize;
        float *outputX2 = outputY1 + gridSize;
        float *outputY2 = outputX2 + gridSize;
        float threshold = detectionParams.perClassPreclusterThreshold[c];
        for (int h = 0; h < gridH; h++) {
            for (int w = 0; w < gridW; w++) {
                int i = w + h * gridW;
                if (outputCovBuf[c * gridSize + i] >= threshold) {
                    NvDsInferObjectDetectionInfo object;
                    float rectX1f, rectY1f, rectX2f, rectY2f;

                    rectX1f = (outputX1[w + h * gridW] - gcCentersX[w]) * -bboxNormX;
                    rectY1f = (outputY1[w + h * gridW] - gcCentersY[h]) * -bboxNormY;
                    rectX2f = (outputX2[w + h * gridW] + gcCentersX[w]) * bboxNormX;
                    rectY2f = (outputY2[w + h * gridW] + gcCentersY[h]) * bboxNormY;

                    object.classId = c;
                    object.detectionConfidence = outputCovBuf[c * gridSize + i];

                    /* Clip object box co-ordinates to network resolution */
                    object.left = CLIP(rectX1f, 0, networkInfo.width - 1);
                    object.top = CLIP(rectY1f, 0, networkInfo.height - 1);
                    object.width = CLIP(rectX2f, 0, networkInfo.width - 1) -
                        object.left + 1;
                    object.height = CLIP(rectY2f, 0, networkInfo.height - 1) -
                        object.top + 1;

                    objectList.push_back(object);
                }
            }
        }
    }
    return true;
}
CHECK_CUSTOM_PARSE_FUNC_PROTOTYPE(NvDsInferParseCustomResnet);
2) Modify the model's nvinfer configuration file
# For resnet10 detector
parse-bbox-func-name=NvDsInferParseCustomResnet
custom-lib-path=/path/to/this/directory/libnvds_infercustomparser.so
# For resnet18 vehicle type classifier
parse-classifier-func-name=NvDsInferClassiferParseCustomSoftmax
custom-lib-path=/path/to/this/directory/libnvds_infercustomparser.so
# For Tensorflow/Onnx SSD detector within nvinferserver
infer_config {
  postprocess {
    detection {
      custom_parse_bbox_func: "NvDsInferParseCustomTfSSD"
    }
  }
  custom_lib {
    path: "/path/to/this/directory/libnvds_infercustomparser.so"
  }
}