This article describes how to use Open Neural Network Exchange (ONNX) to make predictions on computer vision models generated by automated machine learning (AutoML) in Azure Machine Learning.
To use ONNX for predictions, you need to:
Download ONNX model files from an AutoML training run.
Understand the inputs and outputs of an ONNX model.
Preprocess your data so that it's in the required format for input images.
Perform inference with ONNX Runtime for Python.
Visualize predictions for object detection and instance segmentation tasks.
ONNX is an open standard for machine learning and deep learning models. It enables model import and export (interoperability) across the popular AI frameworks. For more details, explore the ONNX GitHub project.
ONNX Runtime is an open-source project that supports cross-platform inference. ONNX Runtime provides APIs across programming languages (including Python, C++, C#, C, Java, and JavaScript). You can use these APIs to perform inference on input images. After you export the model to ONNX format, you can use these APIs in any programming language that your project needs.
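For example, a minimal Python sketch of the ONNX Runtime API (the model path and dummy input shape here are placeholders, not values from this article):
import numpy as np
import onnxruntime

session = onnxruntime.InferenceSession("model.onnx")
input_name = session.get_inputs()[0].name
# run inference on a dummy NCHW float32 input
outputs = session.run(None, {input_name: np.zeros((1, 3, 224, 224), dtype=np.float32)})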
In this guide, you learn how to use the Python APIs for ONNX Runtime to make predictions on images for popular vision tasks. You can use these ONNX exported models across languages.
Get an AutoML-trained computer vision model for any of the supported image tasks: classification, object detection, or instance segmentation. Learn more about AutoML support for computer vision tasks.
Install the onnxruntime package. The methods in this article have been tested with versions 1.3.0 to 1.8.0.
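For example, in a notebook you can install a tested version with pip (a minimal sketch; pin whichever tested version you prefer):
%pip install onnxruntime==1.8.0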
Download ONNX model files
You can download ONNX model files from AutoML runs by using the Azure Machine Learning studio UI or the Azure Machine Learning Python SDK. We recommend downloading via the SDK with the experiment name and parent run ID.
Azure Machine Learning studio
In Azure Machine Learning studio, go to your experiment by using the hyperlink to the experiment generated in the training notebook, or by selecting the experiment name on the Experiments tab under Assets. Then select the best child run.
Within the best child run, go to Outputs + logs > train_artifacts. Use the Download button to manually download the following files:
labels.json: a file that contains all the classes or labels in the training dataset.
model.onnx: the model in ONNX format.
Save the downloaded model files in a directory. The example in this article uses the ./automl_models directory.
Azure Machine Learning Python SDK
With the SDK, you can select the best child run (by primary metric) with the experiment name and parent run ID. Then, you can download the labels.json and model.onnx files.
The following code returns the best child run based on the relevant primary metric.
from azure.identity import DefaultAzureCredential
from azure.ai.ml import MLClient

credential = DefaultAzureCredential()
ml_client = None
try:
    ml_client = MLClient.from_config(credential)
except Exception as ex:
    print(ex)
    # Enter details of your Azure Machine Learning workspace
    subscription_id = ''
    resource_group = ''
    workspace_name = ''
    ml_client = MLClient(credential, subscription_id, resource_group, workspace_name)
import mlflow
from mlflow.tracking.client import MlflowClient

# Obtain the tracking URI from MLClient
MLFLOW_TRACKING_URI = ml_client.workspaces.get(
    name=ml_client.workspace_name
).mlflow_tracking_uri
mlflow.set_tracking_uri(MLFLOW_TRACKING_URI)

mlflow_client = MlflowClient()

# Specify the job name
job_name = ''

# Get the parent run
mlflow_parent_run = mlflow_client.get_run(job_name)
best_child_run_id = mlflow_parent_run.data.tags['automl_best_child_run_id']

# Get the best child run
best_run = mlflow_client.get_run(best_child_run_id)
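To sanity-check which run was selected, you can print its ID and metrics (a small usage sketch):
print(best_run.info.run_id)
print(best_run.data.metrics)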
Download the labels.json file, which contains all the classes and labels in the training dataset.
import os

local_dir = './automl_models'
if not os.path.exists(local_dir):
    os.mkdir(local_dir)

labels_file = mlflow_client.download_artifacts(
    best_run.info.run_id, 'train_artifacts/labels.json', local_dir
)
Download the model.onnx file.
onnx_model_path = mlflow_client.download_artifacts(
    best_run.info.run_id, 'train_artifacts/model.onnx', local_dir
)
If you use ONNX models for batch inferencing on object detection and instance segmentation, refer to the following section on model generation for batch scoring.
Model generation for batch scoring
By default, AutoML for Images supports batch scoring for classification. But object detection and instance segmentation ONNX models don't support batch inferencing. For batch inference on object detection and instance segmentation, use the following procedure to generate an ONNX model for the required batch size. Models generated for a specific batch size don't work for other batch sizes.
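Because the batch dimension is baked into the exported graph, you can verify it before scoring (a minimal sketch; onnx_model_path is assumed to point at the generated model):
import onnxruntime

session = onnxruntime.InferenceSession(onnx_model_path)
# the first input dimension is the fixed batch size the model was generated for
print(session.get_inputs()[0].shape)  # for example, [8, 3, 600, 800]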
Download the conda environment file, and create an environment object to be used with the command job.
# Download conda file and define the environment
conda_file = mlflow_client.download_artifacts(
    best_run.info.run_id, "outputs/conda_env_v_1_0_0.yml", local_dir
)

from azure.ai.ml.entities import Environment
env = Environment(
    name="automl-images-env-onnx",
    description="environment for automl images ONNX batch model generation",
    image="mcr.microsoft.com/azureml/openmpi4.1.0-cuda11.1-cudnn8-ubuntu18.04",
    conda_file=conda_file,
)
Use the following model-specific arguments to submit the script. For more details on the arguments, refer to model-specific hyperparameters; for the supported object detection model names, refer to the supported model architectures section.
To get the argument values needed to create the batch scoring model, refer to the scoring scripts generated under the outputs folder of the AutoML training runs. Use the hyperparameter values available in the model settings variable inside the scoring file for the best child run.
Multi-class image classification
Multi-label image classification
Object detection with Faster R-CNN or RetinaNet
Object detection with YOLO
inputs = {'model_name': 'fasterrcnn_resnet34_fpn',  # enter the faster rcnn or retinanet model name
          'batch_size': 8,  # enter the batch size of your choice
          'height_onnx': 600,  # enter the height of input to ONNX model
          'width_onnx': 800,  # enter the width of input to ONNX model
          'job_name': job_name,
          'task_type': 'image-object-detection',
          'min_size': 600,  # minimum size of the image to be rescaled before feeding it to the backbone
          'max_size': 1333,  # maximum size of the image to be rescaled before feeding it to the backbone
          'box_score_thresh': 0.3,  # threshold to return proposals with a classification score > box_score_thresh
          'box_nms_thresh': 0.5,  # NMS threshold for the prediction head
          'box_detections_per_img': 100  # maximum number of detections per image, for all classes
          }
inputs = {'model_name': 'yolov5',  # enter the yolo model name
          'batch_size': 8,  # enter the batch size of your choice
          'height_onnx': 640,  # enter the height of input to ONNX model
          'width_onnx': 640,  # enter the width of input to ONNX model
          'job_name': job_name,
          'task_type': 'image-object-detection',
          'img_size': 640,  # image size for inference
          'model_size': 'small',  # size of the yolo model
          'box_score_thresh': 0.1,  # threshold to return proposals with a classification score > box_score_thresh
          'box_iou_thresh': 0.5  # IoU threshold used during NMS postprocessing
          }
inputs = {'model_name': 'maskrcnn_resnet50_fpn',  # enter the maskrcnn model name
          'batch_size': 8,  # enter the batch size of your choice
          'height_onnx': 600,  # enter the height of input to ONNX model
          'width_onnx': 800,  # enter the width of input to ONNX model
          'job_name': job_name,
          'task_type': 'image-instance-segmentation',
          'min_size': 600,  # minimum size of the image to be rescaled before feeding it to the backbone
          'max_size': 1333,  # maximum size of the image to be rescaled before feeding it to the backbone
          'box_score_thresh': 0.3,  # threshold to return proposals with a classification score > box_score_thresh
          'box_nms_thresh': 0.5,  # NMS threshold for the prediction head
          'box_detections_per_img': 100  # maximum number of detections per image, for all classes
          }
from azure.ai.ml import command

job = command(
    code="./onnx_generator_files",  # local path where the code is stored
    command="python ONNX_batch_model_generator_automl_for_images.py --model_name ${{inputs.model_name}} --batch_size ${{inputs.batch_size}} --height_onnx ${{inputs.height_onnx}} --width_onnx ${{inputs.width_onnx}} --job_name ${{inputs.job_name}} --task_type ${{inputs.task_type}} --min_size ${{inputs.min_size}} --max_size ${{inputs.max_size}} --box_score_thresh ${{inputs.box_score_thresh}} --box_nms_thresh ${{inputs.box_nms_thresh}} --box_detections_per_img ${{inputs.box_detections_per_img}}",
    inputs=inputs,
    environment=env,
    compute=compute_name,  # name of your compute target
    display_name="ONNX-batch-model-generation-rcnn",
    description="Use PyTorch to generate an ONNX batch scoring model.",
)

returned_job = ml_client.create_or_update(job)  # submit the command job
ml_client.jobs.stream(returned_job.name)  # stream the job logs
from azure.ai.ml import command

job = command(
    code="./onnx_generator_files",  # local path where the code is stored
    command="python ONNX_batch_model_generator_automl_for_images.py --model_name ${{inputs.model_name}} --batch_size ${{inputs.batch_size}} --height_onnx ${{inputs.height_onnx}} --width_onnx ${{inputs.width_onnx}} --job_name ${{inputs.job_name}} --task_type ${{inputs.task_type}} --img_size ${{inputs.img_size}} --model_size ${{inputs.model_size}} --box_score_thresh ${{inputs.box_score_thresh}} --box_iou_thresh ${{inputs.box_iou_thresh}}",
    inputs=inputs,
    environment=env,
    compute=compute_name,  # name of your compute target
    display_name="ONNX-batch-model-generation",
    description="Use PyTorch to generate an ONNX batch scoring model.",
)

returned_job = ml_client.create_or_update(job)  # submit the command job
ml_client.jobs.stream(returned_job.name)  # stream the job logs
from azure.ai.ml import command

job = command(
    code="./onnx_generator_files",  # local path where the code is stored
    command="python ONNX_batch_model_generator_automl_for_images.py --model_name ${{inputs.model_name}} --batch_size ${{inputs.batch_size}} --height_onnx ${{inputs.height_onnx}} --width_onnx ${{inputs.width_onnx}} --job_name ${{inputs.job_name}} --task_type ${{inputs.task_type}} --min_size ${{inputs.min_size}} --max_size ${{inputs.max_size}} --box_score_thresh ${{inputs.box_score_thresh}} --box_nms_thresh ${{inputs.box_nms_thresh}} --box_detections_per_img ${{inputs.box_detections_per_img}}",
    inputs=inputs,
    environment=env,
    compute=compute_name,  # name of your compute target
    display_name="ONNX-batch-model-generation-maskrcnn",
    description="Use PyTorch to generate an ONNX batch scoring model.",
)

returned_job = ml_client.create_or_update(job)  # submit the command job
ml_client.jobs.stream(returned_job.name)  # stream the job logs
After the batch model is generated, either download it from Outputs + logs > outputs manually through the UI, or use the following method:
batch_size = 8  # use the batch size used to generate the model
returned_job_run = mlflow_client.get_run(returned_job.name)

# Download run's artifacts/outputs
onnx_model_path = mlflow_client.download_artifacts(
    returned_job_run.info.run_id, 'outputs/model_' + str(batch_size) + '.onnx', local_dir
)
After the model downloading step, you use the ONNX Runtime Python package to perform inferencing by using the model.onnx file. For demonstration purposes, this article uses the datasets from How to prepare image datasets for each vision task.
We've trained the models for all vision tasks with their respective datasets to demonstrate ONNX model inference.
Load the labels and ONNX model files
The following code snippet loads labels.json, where class names are ordered. That is, if the ONNX model predicts a label ID as 2, it corresponds to the label name given at the third index in the labels.json file.
import json
import onnxruntime

labels_file = "automl_models/labels.json"
with open(labels_file) as f:
    classes = json.load(f)
print(classes)

try:
    session = onnxruntime.InferenceSession(onnx_model_path)
    print("ONNX model loaded...")
except Exception as e:
    print("Error loading ONNX file: ", str(e))
When you use the model, it's important to know some model-specific and task-specific details. These details include the number of inputs and number of outputs, the expected input shape or format for preprocessing the image, and the output shape so you know the model-specific or task-specific outputs.
sess_input = session.get_inputs()
sess_output = session.get_outputs()
print(f"No. of inputs : {len(sess_input)}, No. of outputs : {len(sess_output)}")

for idx, inp in enumerate(sess_input):
    print(f"{idx} Input name : {inp.name}, Input shape : {inp.shape}, "
          f"Input type : {inp.type}")

for idx, out in enumerate(sess_output):
    print(f"{idx} Output name : {out.name}, Output shape : {out.shape}, "
          f"Output type : {out.type}")
Every ONNX model has a predefined set of input and output formats.
Multi-class image classification
Multi-label image classification
Object detection with Faster R-CNN or RetinaNet
Object detection with YOLO
This example applies the model trained on the fridgeObjects dataset with 134 images and 4 classes/labels to explain ONNX model inference. For more information on training an image classification task, see the multi-class image classification notebook.
The input is a preprocessed image.
This example uses the model trained on the multi-label fridgeObjects dataset with 128 images and 4 classes/labels to explain ONNX model inference. For more information on model training for multi-label image classification, see the multi-label image classification notebook.
The input is a preprocessed image.
This object detection example uses the model trained on the fridgeObjects detection dataset of 128 images and 4 classes/labels to explain ONNX model inference. This example trains Faster R-CNN models to demonstrate the inference steps. For more information on training object detection models, see the object detection notebook.
The input is a preprocessed image.
The output is a tuple of output_names and predictions. Here, output_names and predictions are lists with length 3*batch_size each.

output_names: a list of (3*batch_size) keys. For a batch size of 2, output_names will be ['boxes_0', 'labels_0', 'scores_0', 'boxes_1', 'labels_1', 'scores_1'].

predictions: a list of (3*batch_size) ndarray(float). For a batch size of 2, predictions will be of shape [(n1_boxes, 4), (n1_boxes), (n1_boxes), (n2_boxes, 4), (n2_boxes), (n2_boxes)]. Here, the values at each index correspond to the same index in output_names.
For each sample in the batch of images, the model returns boxes (top-left and bottom-right coordinates), a label per box, and a confidence score per box.
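For example, to pull out the boxes, labels, and scores for sample i in a batch (a minimal sketch based on the three-outputs-per-image layout above):
i = 0  # index of the sample in the batch
boxes_i = predictions[3 * i]       # shape (n_boxes, 4)
labels_i = predictions[3 * i + 1]  # shape (n_boxes,)
scores_i = predictions[3 * i + 2]  # shape (n_boxes,)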
This object detection example uses the model trained on the fridgeObjects detection dataset of 128 images and 4 classes/labels to explain ONNX model inference. This example trains YOLO models to demonstrate the inference steps. For more information on training object detection models, see the object detection notebook.
The input is a preprocessed image, with the shape (1, 3, 640, 640) for a batch size of 1, and a height and width of 640. These numbers correspond to the values used in the training example.
For this instance segmentation example, you use the Mask R-CNN model trained on the fridgeObjects dataset with 128 images and 4 classes/labels to explain ONNX model inference. For more information on training the instance segmentation model, see the instance segmentation notebook.
Only Mask R-CNN is supported for instance segmentation tasks. The input and output formats are based on Mask R-CNN only.
The input is a preprocessed image. The ONNX model for Mask R-CNN has been exported to work with images of different shapes. We recommend that you resize them to a fixed size consistent with the training image sizes, for better performance.
The output is a tuple of output_names and predictions. Here, output_names and predictions are lists with length 4*batch_size each.

output_names: a list of (4*batch_size) keys. For a batch size of 2, output_names will be ['boxes_0', 'labels_0', 'scores_0', 'masks_0', 'boxes_1', 'labels_1', 'scores_1', 'masks_1'].

predictions: a list of (4*batch_size) ndarray(float). For a batch size of 2, predictions will be of shape [(n1_boxes, 4), (n1_boxes), (n1_boxes), (n1_boxes, 1, height_onnx, width_onnx), (n2_boxes, 4), (n2_boxes), (n2_boxes), (n2_boxes, 1, height_onnx, width_onnx)]. Here, the values at each index correspond to the same index in output_names.
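For example, to pull out the outputs for sample i in a batch (a minimal sketch based on the four-outputs-per-image layout above):
i = 0  # index of the sample in the batch
boxes_i = predictions[4 * i]       # shape (n_boxes, 4)
labels_i = predictions[4 * i + 1]  # shape (n_boxes,)
scores_i = predictions[4 * i + 2]  # shape (n_boxes,)
masks_i = predictions[4 * i + 3]   # shape (n_boxes, 1, height_onnx, width_onnx)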
Preprocessing

Perform the following preprocessing steps for the ONNX model inference:

Convert the image to RGB.
Resize the image to valid_resize_size and valid_resize_size values that correspond to the values used in the transformation of the validation dataset during training. The default value for valid_resize_size is 256.
Center crop the image to height_onnx_crop_size and width_onnx_crop_size. This corresponds to valid_crop_size, with the default value of 224.
Change HxWxC to CxHxW.
Convert to float type.
Normalize with ImageNet's mean = [0.485, 0.456, 0.406] and std = [0.229, 0.224, 0.225].

If you chose different values for the hyperparameters valid_resize_size and valid_crop_size during training, those values should be used.
Get the input shape needed for the ONNX model.
batch, channel, height_onnx_crop_size, width_onnx_crop_size = session.get_inputs()[0].shape
batch, channel, height_onnx_crop_size, width_onnx_crop_size
Without PyTorch
import glob
import numpy as np
from PIL import Image
def preprocess(image, resize_size, crop_size_onnx):
    """Perform pre-processing on raw input image

    :param image: raw input image
    :type image: PIL image
    :param resize_size: value to resize the image
    :type resize_size: Int
    :param crop_size_onnx: expected height of an input image in onnx model
    :type crop_size_onnx: Int
    :return: pre-processed image in numpy format
    :rtype: ndarray 1xCxHxW
    """
    image = image.convert('RGB')
    # resize
    image = image.resize((resize_size, resize_size))
    # center crop
    left = (resize_size - crop_size_onnx) / 2
    top = (resize_size - crop_size_onnx) / 2
    right = (resize_size + crop_size_onnx) / 2
    bottom = (resize_size + crop_size_onnx) / 2
    image = image.crop((left, top, right, bottom))
    np_image = np.array(image)
    # HWC -> CHW
    np_image = np_image.transpose(2, 0, 1)  # CxHxW
    # normalize the image
    mean_vec = np.array([0.485, 0.456, 0.406])
    std_vec = np.array([0.229, 0.224, 0.225])
    norm_img_data = np.zeros(np_image.shape).astype('float32')
    for i in range(np_image.shape[0]):
        norm_img_data[i, :, :] = (np_image[i, :, :] / 255 - mean_vec[i]) / std_vec[i]
    np_image = np.expand_dims(norm_img_data, axis=0)  # 1xCxHxW
    return np_image
# following code loads only batch_size number of images for demonstrating ONNX inference
# make sure that the data directory has at least batch_size number of images
test_images_path = "automl_models_multi_cls/test_images_dir/*"  # replace with path to images
# Select batch size needed
batch_size = 8
# you can modify resize_size based on your trained model
resize_size = 256
# height and width will be the same for classification
crop_size_onnx = height_onnx_crop_size

image_files = glob.glob(test_images_path)
img_processed_list = []
for i in range(batch_size):
    img = Image.open(image_files[i])
    img_processed_list.append(preprocess(img, resize_size, crop_size_onnx))

if len(img_processed_list) > 1:
    img_data = np.concatenate(img_processed_list)
elif len(img_processed_list) == 1:
    img_data = img_processed_list[0]
else:
    img_data = None

assert batch_size == img_data.shape[0]
With PyTorch
import glob
import torch
import numpy as np
from PIL import Image
from torchvision import transforms
def _make_3d_tensor(x) -> torch.Tensor:
    """This function is for images that have less channels.

    :param x: input tensor
    :type x: torch.Tensor
    :return: return a tensor with the correct number of channels
    :rtype: torch.Tensor
    """
    return x if x.shape[0] == 3 else x.expand((3, x.shape[1], x.shape[2]))

def preprocess(image, resize_size, crop_size_onnx):
    transform = transforms.Compose([
        transforms.Resize(resize_size),
        transforms.CenterCrop(crop_size_onnx),
        transforms.ToTensor(),
        transforms.Lambda(_make_3d_tensor),
        transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])])
    img_data = transform(image)
    img_data = img_data.numpy()
    img_data = np.expand_dims(img_data, axis=0)
    return img_data
# following code loads only batch_size number of images for demonstrating ONNX inference
# make sure that the data directory has at least batch_size number of images
test_images_path = "automl_models_multi_cls/test_images_dir/*"  # replace with path to images
# Select batch size needed
batch_size = 8
# you can modify resize_size based on your trained model
resize_size = 256
# height and width will be the same for classification
crop_size_onnx = height_onnx_crop_size

image_files = glob.glob(test_images_path)
img_processed_list = []
for i in range(batch_size):
    img = Image.open(image_files[i])
    img_processed_list.append(preprocess(img, resize_size, crop_size_onnx))

if len(img_processed_list) > 1:
    img_data = np.concatenate(img_processed_list)
elif len(img_processed_list) == 1:
    img_data = img_processed_list[0]
else:
    img_data = None

assert batch_size == img_data.shape[0]
Perform the following preprocessing steps for the ONNX model inference:

Convert the image to RGB.
Resize the image to valid_resize_size and valid_resize_size values that correspond to the values used in the transformation of the validation dataset during training. The default value for valid_resize_size is 256.
Center crop the image to height_onnx_crop_size and width_onnx_crop_size. This corresponds to valid_crop_size, with the default value of 224.
Change HxWxC to CxHxW.
Convert to float type.
Normalize with ImageNet's mean = [0.485, 0.456, 0.406] and std = [0.229, 0.224, 0.225].

If you chose different values for the hyperparameters valid_resize_size and valid_crop_size during training, those values should be used.
Get the input shape needed for the ONNX model.
batch, channel, height_onnx_crop_size, width_onnx_crop_size = session.get_inputs()[0].shape
batch, channel, height_onnx_crop_size, width_onnx_crop_size
Without PyTorch
import glob
import numpy as np
from PIL import Image
def preprocess(image, resize_size, crop_size_onnx):
    """Perform pre-processing on raw input image

    :param image: raw input image
    :type image: PIL image
    :param resize_size: value to resize the image
    :type resize_size: Int
    :param crop_size_onnx: expected height of an input image in onnx model
    :type crop_size_onnx: Int
    :return: pre-processed image in numpy format
    :rtype: ndarray 1xCxHxW
    """
    image = image.convert('RGB')
    # resize
    image = image.resize((resize_size, resize_size))
    # center crop
    left = (resize_size - crop_size_onnx) / 2
    top = (resize_size - crop_size_onnx) / 2
    right = (resize_size + crop_size_onnx) / 2
    bottom = (resize_size + crop_size_onnx) / 2
    image = image.crop((left, top, right, bottom))
    np_image = np.array(image)
    # HWC -> CHW
    np_image = np_image.transpose(2, 0, 1)  # CxHxW
    # normalize the image
    mean_vec = np.array([0.485, 0.456, 0.406])
    std_vec = np.array([0.229, 0.224, 0.225])
    norm_img_data = np.zeros(np_image.shape).astype('float32')
    for i in range(np_image.shape[0]):
        norm_img_data[i, :, :] = (np_image[i, :, :] / 255 - mean_vec[i]) / std_vec[i]
    np_image = np.expand_dims(norm_img_data, axis=0)  # 1xCxHxW
    return np_image
# following code loads only batch_size number of images for demonstrating ONNX inference
# make sure that the data directory has at least batch_size number of images
test_images_path = "automl_models_multi_label/test_images_dir/*"  # replace with path to images
# Select batch size needed
batch_size = 8
# you can modify resize_size based on your trained model
resize_size = 256
# height and width will be the same for classification
crop_size_onnx = height_onnx_crop_size

image_files = glob.glob(test_images_path)
img_processed_list = []
for i in range(batch_size):
    img = Image.open(image_files[i])
    img_processed_list.append(preprocess(img, resize_size, crop_size_onnx))

if len(img_processed_list) > 1:
    img_data = np.concatenate(img_processed_list)
elif len(img_processed_list) == 1:
    img_data = img_processed_list[0]
else:
    img_data = None

assert batch_size == img_data.shape[0]
With PyTorch
import glob
import torch
import numpy as np
from PIL import Image
from torchvision import transforms
def _make_3d_tensor(x) -> torch.Tensor:
    """This function is for images that have less channels.

    :param x: input tensor
    :type x: torch.Tensor
    :return: return a tensor with the correct number of channels
    :rtype: torch.Tensor
    """
    return x if x.shape[0] == 3 else x.expand((3, x.shape[1], x.shape[2]))

def preprocess(image, resize_size, crop_size_onnx):
    """Perform pre-processing on raw input image

    :param image: raw input image
    :type image: PIL image
    :param resize_size: value to resize the image
    :type resize_size: Int
    :param crop_size_onnx: expected height of an input image in onnx model
    :type crop_size_onnx: Int
    :return: pre-processed image in numpy format
    :rtype: ndarray 1xCxHxW
    """
    transform = transforms.Compose([
        transforms.Resize(resize_size),
        transforms.CenterCrop(crop_size_onnx),
        transforms.ToTensor(),
        transforms.Lambda(_make_3d_tensor),
        transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])])

    img_data = transform(image)
    img_data = img_data.numpy()
    img_data = np.expand_dims(img_data, axis=0)
    return img_data
# following code loads only batch_size number of images for demonstrating ONNX inference
# make sure that the data directory has at least batch_size number of images
test_images_path = "automl_models_multi_label/test_images_dir/*"  # replace with path to images
# Select batch size needed
batch_size = 8
# you can modify resize_size based on your trained model
resize_size = 256
# height and width will be the same for classification
crop_size_onnx = height_onnx_crop_size

image_files = glob.glob(test_images_path)
img_processed_list = []
for i in range(batch_size):
    img = Image.open(image_files[i])
    img_processed_list.append(preprocess(img, resize_size, crop_size_onnx))

if len(img_processed_list) > 1:
    img_data = np.concatenate(img_processed_list)
elif len(img_processed_list) == 1:
    img_data = img_processed_list[0]
else:
    img_data = None

assert batch_size == img_data.shape[0]
For object detection with the Faster R-CNN architecture, follow the same preprocessing steps as image classification, except for image cropping. You can resize the image with height 600 and width 800. You can get the expected input height and width with the following code.
batch, channel, height_onnx, width_onnx = session.get_inputs()[0].shape
batch, channel, height_onnx, width_onnx
Then, perform the preprocessing steps.
import glob
import numpy as np
from PIL import Image
def preprocess(image, height_onnx, width_onnx):
    """Perform pre-processing on raw input image

    :param image: raw input image
    :type image: PIL image
    :param height_onnx: expected height of an input image in onnx model
    :type height_onnx: Int
    :param width_onnx: expected width of an input image in onnx model
    :type width_onnx: Int
    :return: pre-processed image in numpy format
    :rtype: ndarray 1xCxHxW
    """
    image = image.convert('RGB')
    image = image.resize((width_onnx, height_onnx))
    np_image = np.array(image)
    # HWC -> CHW
    np_image = np_image.transpose(2, 0, 1)  # CxHxW
    # normalize the image
    mean_vec = np.array([0.485, 0.456, 0.406])
    std_vec = np.array([0.229, 0.224, 0.225])
    norm_img_data = np.zeros(np_image.shape).astype('float32')
    for i in range(np_image.shape[0]):
        norm_img_data[i, :, :] = (np_image[i, :, :] / 255 - mean_vec[i]) / std_vec[i]
    np_image = np.expand_dims(norm_img_data, axis=0)  # 1xCxHxW
    return np_image
# following code loads only batch_size number of images for demonstrating ONNX inference
# make sure that the data directory has at least batch_size number of images
test_images_path = "automl_models_od/test_images_dir/*"  # replace with path to images
batch_size = 8  # use the batch size the ONNX model was generated with

image_files = glob.glob(test_images_path)
img_processed_list = []
for i in range(batch_size):
    img = Image.open(image_files[i])
    img_processed_list.append(preprocess(img, height_onnx, width_onnx))

if len(img_processed_list) > 1:
    img_data = np.concatenate(img_processed_list)
elif len(img_processed_list) == 1:
    img_data = img_processed_list[0]
else:
    img_data = None

assert batch_size == img_data.shape[0]
For object detection with the YOLO architecture, follow the same preprocessing steps as image classification, except for image cropping. You can resize the image to the expected input size (for example, a height and width of 640 as in the training example above), and get the expected input height and width with the following code.
batch, channel, height_onnx, width_onnx = session.get_inputs()[0].shape
batch, channel, height_onnx, width_onnx
For the preprocessing required for YOLO, refer to yolo_onnx_preprocessing_utils.py.
import glob
import numpy as np
from yolo_onnx_preprocessing_utils import preprocess

# use height and width based on the generated model
test_images_path = "automl_models_od_yolo/test_images_dir/*"  # replace with path to images
batch_size = 8  # use the batch size the ONNX model was generated with
image_files = glob.glob(test_images_path)
img_processed_list = []
pad_list = []
for i in range(batch_size):
    img_processed, pad = preprocess(image_files[i])
    img_processed_list.append(img_processed)
    pad_list.append(pad)

if len(img_processed_list) > 1:
    img_data = np.concatenate(img_processed_list)
elif len(img_processed_list) == 1:
    img_data = img_processed_list[0]
else:
    img_data = None

assert batch_size == img_data.shape[0]
Change HxWxC to CxHxW.
Convert to float type.
Normalize with ImageNet's mean = [0.485, 0.456, 0.406] and std = [0.229, 0.224, 0.225].

For resize_height and resize_width, you can also use the values that you used during training, bounded by the min_size and max_size hyperparameters for Mask R-CNN.
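For example, a minimal sketch of choosing a resize that respects those bounds, assuming proportional scaling similar to torchvision's Mask R-CNN transform (clamp_resize is a hypothetical helper, not part of the AutoML scripts):
def clamp_resize(height, width, min_size=600, max_size=1333):
    # scale so the shorter side reaches min_size without the longer side exceeding max_size
    scale = min(min_size / min(height, width), max_size / max(height, width))
    return int(round(height * scale)), int(round(width * scale))

resize_height, resize_width = clamp_resize(960, 1280)  # gives (600, 800)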
import glob
import numpy as np
from PIL import Image
def preprocess(image, resize_height, resize_width):
    """Perform pre-processing on raw input image

    :param image: raw input image
    :type image: PIL image
    :param resize_height: resize height of an input image
    :type resize_height: Int
    :param resize_width: resize width of an input image
    :type resize_width: Int
    :return: pre-processed image in numpy format
    :rtype: ndarray of shape 1xCxHxW
    """
    image = image.convert('RGB')
    image = image.resize((resize_width, resize_height))
    np_image = np.array(image)
    # HWC -> CHW
    np_image = np_image.transpose(2, 0, 1)  # CxHxW
    # normalize the image
    mean_vec = np.array([0.485, 0.456, 0.406])
    std_vec = np.array([0.229, 0.224, 0.225])
    norm_img_data = np.zeros(np_image.shape).astype('float32')
    for i in range(np_image.shape[0]):
        norm_img_data[i, :, :] = (np_image[i, :, :] / 255 - mean_vec[i]) / std_vec[i]
    np_image = np.expand_dims(norm_img_data, axis=0)  # 1xCxHxW
    return np_image
# following code loads only batch_size number of images for demonstrating ONNX inference
# make sure that the data directory has at least batch_size number of images
# use height and width based on the generated model
test_images_path = "automl_models_is/test_images_dir/*"  # replace with path to images
batch_size = 8  # use the batch size the ONNX model was generated with
image_files = glob.glob(test_images_path)
img_processed_list = []
for i in range(batch_size):
    img = Image.open(image_files[i])
    img_processed_list.append(preprocess(img, height_onnx, width_onnx))

if len(img_processed_list) > 1:
    img_data = np.concatenate(img_processed_list)
elif len(img_processed_list) == 1:
    img_data = img_processed_list[0]
else:
    img_data = None

assert batch_size == img_data.shape[0]
Inference with ONNX Runtime
Inferencing with ONNX Runtime differs for each computer vision task.
Multi-class image classification
Multi-label image classification
Object detection with Faster R-CNN or RetinaNet
Object detection with YOLO
def get_predictions_from_ONNX(onnx_session, img_data):
    """Perform predictions with ONNX runtime

    :param onnx_session: onnx model session
    :type onnx_session: class InferenceSession
    :param img_data: pre-processed numpy image
    :type img_data: ndarray with shape 1xCxHxW
    :return: scores with shapes
            (1, No. of classes in training dataset)
    :rtype: numpy array
    """
    sess_input = onnx_session.get_inputs()
    sess_output = onnx_session.get_outputs()
    print(f"No. of inputs : {len(sess_input)}, No. of outputs : {len(sess_output)}")
    # predict with ONNX Runtime
    output_names = [output.name for output in sess_output]
    scores = onnx_session.run(output_names=output_names,
                              input_feed={sess_input[0].name: img_data})
    return scores[0]

scores = get_predictions_from_ONNX(session, img_data)
def get_predictions_from_ONNX(onnx_session, img_data):
    """Perform predictions with ONNX runtime

    :param onnx_session: onnx model session
    :type onnx_session: class InferenceSession
    :param img_data: pre-processed numpy image
    :type img_data: ndarray with shape 1xCxHxW
    :return: scores with shapes
            (1, No. of classes in training dataset)
    :rtype: numpy array
    """
    sess_input = onnx_session.get_inputs()
    sess_output = onnx_session.get_outputs()
    print(f"No. of inputs : {len(sess_input)}, No. of outputs : {len(sess_output)}")
    # predict with ONNX Runtime
    output_names = [output.name for output in sess_output]
    scores = onnx_session.run(output_names=output_names,
                              input_feed={sess_input[0].name: img_data})
    return scores[0]

scores = get_predictions_from_ONNX(session, img_data)
def get_predictions_from_ONNX(onnx_session, img_data):
    """Perform predictions with ONNX runtime

    :param onnx_session: onnx model session
    :type onnx_session: class InferenceSession
    :param img_data: pre-processed numpy image
    :type img_data: ndarray with shape 1xCxHxW
    :return: boxes, labels, scores
            (No. of boxes, 4) (No. of boxes,) (No. of boxes,)
    :rtype: tuple
    """
    sess_input = onnx_session.get_inputs()
    sess_output = onnx_session.get_outputs()
    # predict with ONNX Runtime
    output_names = [output.name for output in sess_output]
    predictions = onnx_session.run(output_names=output_names,
                                   input_feed={sess_input[0].name: img_data})
    return output_names, predictions

output_names, predictions = get_predictions_from_ONNX(session, img_data)
def get_predictions_from_ONNX(onnx_session, img_data):
    """Perform predictions with ONNX Runtime

    :param onnx_session: onnx model session
    :type onnx_session: class InferenceSession
    :param img_data: pre-processed numpy image
    :type img_data: ndarray with shape 1xCxHxW
    :return: boxes, labels, scores
    :rtype: list
    """
    sess_input = onnx_session.get_inputs()
    sess_output = onnx_session.get_outputs()
    # predict with ONNX Runtime
    output_names = [output.name for output in sess_output]
    pred = onnx_session.run(output_names=output_names,
                            input_feed={sess_input[0].name: img_data})
    return pred[0]

result = get_predictions_from_ONNX(session, img_data)
The instance segmentation model predicts boxes, labels, scores, and masks. ONNX outputs a predicted mask per instance, along with the corresponding bounding boxes and class confidence score. You might need to convert from binary mask to polygon, as sketched after the inference code below.
def get_predictions_from_ONNX(onnx_session, img_data):
    """Perform predictions with ONNX runtime

    :param onnx_session: onnx model session
    :type onnx_session: class InferenceSession
    :param img_data: pre-processed numpy image
    :type img_data: ndarray with shape 1xCxHxW
    :return: boxes, labels, scores, masks with shapes
            (No. of instances, 4) (No. of instances,) (No. of instances,)
            (No. of instances, 1, HEIGHT, WIDTH)
    :rtype: tuple
    """
    sess_input = onnx_session.get_inputs()
    sess_output = onnx_session.get_outputs()
    # predict with ONNX Runtime
    output_names = [output.name for output in sess_output]
    predictions = onnx_session.run(output_names=output_names,
                                   input_feed={sess_input[0].name: img_data})
    return output_names, predictions

output_names, predictions = get_predictions_from_ONNX(session, img_data)
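If you need polygons instead of the binary masks mentioned earlier, one possible approach is contour extraction with OpenCV (a minimal sketch; binary_mask_to_polygons and the 0.5 cut-off are assumptions, not part of the AutoML scripts):
import cv2
import numpy as np

def binary_mask_to_polygons(mask, threshold=0.5):
    # mask: one predicted mask of shape (height, width) with values in [0, 1]
    binary = (mask > threshold).astype(np.uint8)
    contours, _ = cv2.findContours(binary, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    # each contour becomes a flat [x0, y0, x1, y1, ...] list
    return [contour.reshape(-1).tolist() for contour in contours]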
Multi-class image classification
Multi-label image classification
Object detection with Faster R-CNN or RetinaNet
Object detection with YOLO
Without PyTorch
Apply softmax over the predicted values to get classification confidence scores for each class.
def softmax(x):
    e_x = np.exp(x - np.max(x, axis=1, keepdims=True))
    return e_x / np.sum(e_x, axis=1, keepdims=True)

conf_scores = softmax(scores)
class_preds = np.argmax(conf_scores, axis=1)
print("predicted classes:", ([(class_idx, classes[class_idx]) for class_idx in class_preds]))
With PyTorch
conf_scores = torch.nn.functional.softmax(torch.from_numpy(scores), dim=1)
class_preds = torch.argmax(conf_scores, dim=1)
print("predicted classes:", ([(class_idx.item(), classes[class_idx]) for class_idx in class_preds]))
Without PyTorch
# we apply a threshold of 0.5 on confidence scores
score_threshold = 0.5

def sigmoid(x):
    # elementwise sigmoid to turn raw scores into [0, 1] confidences
    return 1 / (1 + np.exp(-x))

conf_scores = sigmoid(scores)
image_wise_preds = np.where(conf_scores > score_threshold)
for image_idx, class_idx in zip(image_wise_preds[0], image_wise_preds[1]):
    print('image: {}, class_index: {}, class_name: {}'.format(image_files[image_idx], class_idx, classes[class_idx]))
With PyTorch
# we apply a threshold of 0.5 on confidence scores
score_threshold = 0.5
conf_scores = torch.sigmoid(torch.from_numpy(scores))
image_wise_preds = torch.where(conf_scores > score_threshold)
for image_idx, class_idx in zip(image_wise_preds[0], image_wise_preds[1]):
    print('image: {}, class_index: {}, class_name: {}'.format(image_files[image_idx], class_idx, classes[class_idx]))
You can apply the previously mentioned steps for multi-class and multi-label classification with all the model architectures supported in AutoML.
For object detection, predictions are automatically on the scale of height_onnx, width_onnx. To transform the predicted box coordinates to the original dimensions, you can implement the following calculations (see the sketch after this list).
Xmin * original_width/width_onnx
Ymin * original_height/height_onnx
Xmax * original_width/width_onnx
Ymax * original_height/height_onnx
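A minimal sketch of that direct rescaling (original_height and original_width are the raw image dimensions; the xmin, ymin, xmax, ymax box layout is assumed from the Faster R-CNN output):
def scale_box_to_original(box, original_height, original_width, height_onnx, width_onnx):
    xmin, ymin, xmax, ymax = box
    return [xmin * original_width / width_onnx,
            ymin * original_height / height_onnx,
            xmax * original_width / width_onnx,
            ymax * original_height / height_onnx]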
Another option is to use the following code to scale the box dimensions to be in the range of [0, 1]. Doing so allows the box coordinates to be multiplied with the corresponding image's height and width (as described in the visualize predictions section) to get the boxes in the original image dimensions.
def _get_box_dims(image_shape, box):
    box_keys = ['topX', 'topY', 'bottomX', 'bottomY']
    height, width = image_shape[0], image_shape[1]

    box_dims = dict(zip(box_keys, [coordinate.item() for coordinate in box]))

    box_dims['topX'] = box_dims['topX'] * 1.0 / width
    box_dims['bottomX'] = box_dims['bottomX'] * 1.0 / width
    box_dims['topY'] = box_dims['topY'] * 1.0 / height
    box_dims['bottomY'] = box_dims['bottomY'] * 1.0 / height

    return box_dims

def _get_prediction(boxes, labels, scores, image_shape, classes):
    bounding_boxes = []
    for box, label_index, score in zip(boxes, labels, scores):
        box_dims = _get_box_dims(image_shape, box)

        box_record = {'box': box_dims,
                      'label': classes[label_index],
                      'score': score.item()}

        bounding_boxes.append(box_record)

    return bounding_boxes

# Filter the results with threshold.
# Please replace the threshold for your test scenario.
score_threshold = 0.8
filtered_boxes_batch = []
for batch_sample in range(0, batch_size * 3, 3):
    # in case of retinanet change the order of boxes, labels, scores to boxes, scores, labels
    # confirm the same from the order of boxes, labels, scores in output_names
    boxes, labels, scores = predictions[batch_sample], predictions[batch_sample + 1], predictions[batch_sample + 2]
    bounding_boxes = _get_prediction(boxes, labels, scores, (height_onnx, width_onnx), classes)
    filtered_bounding_boxes = [box for box in bounding_boxes if box['score'] >= score_threshold]
    filtered_boxes_batch.append(filtered_bounding_boxes)
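To inspect the filtered detections, you can print them the same way the YOLO path does (a small usage sketch):
import json
print(json.dumps(filtered_boxes_batch, indent=1))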
The following code creates boxes, labels, and scores. Use these bounding box details to perform the same postprocessing steps as you did for the Faster R-CNN model.
from yolo_onnx_preprocessing_utils import non_max_suppression, _convert_to_rcnn_output

result_final = non_max_suppression(
    torch.from_numpy(result),
    conf_thres=0.1,
    iou_thres=0.5)

def _get_box_dims(image_shape, box):
    box_keys = ['topX', 'topY', 'bottomX', 'bottomY']
    height, width = image_shape[0], image_shape[1]

    box_dims = dict(zip(box_keys, [coordinate.item() for coordinate in box]))

    box_dims['topX'] = box_dims['topX'] * 1.0 / width
    box_dims['bottomX'] = box_dims['bottomX'] * 1.0 / width
    box_dims['topY'] = box_dims['topY'] * 1.0 / height
    box_dims['bottomY'] = box_dims['bottomY'] * 1.0 / height

    return box_dims

def _get_prediction(label, image_shape, classes):
    boxes = np.array(label["boxes"])
    labels = np.array(label["labels"])
    labels = [label[0] for label in labels]
    scores = np.array(label["scores"])
    scores = [score[0] for score in scores]

    bounding_boxes = []
    for box, label_index, score in zip(boxes, labels, scores):
        box_dims = _get_box_dims(image_shape, box)

        box_record = {'box': box_dims,
                      'label': classes[label_index],
                      'score': score.item()}

        bounding_boxes.append(box_record)

    return bounding_boxes

bounding_boxes_batch = []
for result_i, pad in zip(result_final, pad_list):
    label, image_shape = _convert_to_rcnn_output(result_i, height_onnx, width_onnx, pad)
    bounding_boxes_batch.append(_get_prediction(label, image_shape, classes))
print(json.dumps(bounding_boxes_batch, indent=1))
import matplotlib.image as mpimg
import matplotlib.pyplot as plt
%matplotlib inline

sample_image_index = 0  # change this for an image of interest from image_files list
IMAGE_SIZE = (18, 12)
plt.figure(figsize=IMAGE_SIZE)
img_np = mpimg.imread(image_files[sample_image_index])

img = Image.fromarray(img_np.astype('uint8'), 'RGB')
x, y = img.size

fig, ax = plt.subplots(1, figsize=(15, 15))
# Display the image
ax.imshow(img_np)

label = class_preds[sample_image_index]
if torch.is_tensor(label):
    label = label.item()

conf_score = conf_scores[sample_image_index]
if torch.is_tensor(conf_score):
    conf_score = np.max(conf_score.tolist())
else:
    conf_score = np.max(conf_score)

display_text = '{} ({})'.format(label, round(conf_score, 3))
print(display_text)

color = 'red'
plt.text(30, 30, display_text, color=color, fontsize=30)
plt.show()
import matplotlib.image as mpimg
import matplotlib.pyplot as plt
%matplotlib inline

sample_image_index = 0  # change this for an image of interest from image_files list
IMAGE_SIZE = (18, 12)
plt.figure(figsize=IMAGE_SIZE)
img_np = mpimg.imread(image_files[sample_image_index])

img = Image.fromarray(img_np.astype('uint8'), 'RGB')
x, y = img.size

fig, ax = plt.subplots(1, figsize=(15, 15))
# Display the image
ax.imshow(img_np)

# we apply a threshold of 0.5 on confidence scores
score_threshold = 0.5
label_offset_x = 30
label_offset_y = 30

if torch.is_tensor(conf_scores):
    sample_image_scores = conf_scores[sample_image_index].tolist()
else:
    sample_image_scores = conf_scores[sample_image_index]

for index, score in enumerate(sample_image_scores):
    if score > score_threshold:
        label = classes[index]
        display_text = '{} ({})'.format(label, round(score, 3))
        print(display_text)

        color = 'red'
        plt.text(label_offset_x, label_offset_y, display_text, color=color, fontsize=30)
        label_offset_y += 30

plt.show()
import matplotlib.image as mpimg
import matplotlib.patches as patches
import matplotlib.pyplot as plt
%matplotlib inline

img_np = mpimg.imread(image_files[1])  # replace with desired image index
image_boxes = filtered_boxes_batch[1]  # replace with desired image index

IMAGE_SIZE = (18, 12)
plt.figure(figsize=IMAGE_SIZE)
img = Image.fromarray(img_np.astype('uint8'), 'RGB')
x, y = img.size
print(img.size)

fig, ax = plt.subplots(1)
# Display the image
ax.imshow(img_np)

# Draw box and label for each detection
for detect in image_boxes:
    label = detect['label']
    box = detect['box']
    ymin, xmin, ymax, xmax = box['topY'], box['topX'], box['bottomY'], box['bottomX']
    topleft_x, topleft_y = x * xmin, y * ymin
    width, height = x * (xmax - xmin), y * (ymax - ymin)
    print('{}: {}, {}, {}, {}'.format(detect['label'], topleft_x, topleft_y, width, height))

    rect = patches.Rectangle((topleft_x, topleft_y), width, height,
                             linewidth=1, edgecolor='green', facecolor='none')
    ax.add_patch(rect)

    color = 'green'
    plt.text(topleft_x, topleft_y, label, color=color)

plt.show()
import matplotlib.image as mpimg
import matplotlib.patches as patches
import matplotlib.pyplot as plt
%matplotlib inline

img_np = mpimg.imread(image_files[1])  # replace with desired image index
image_boxes = bounding_boxes_batch[1]  # replace with desired image index

IMAGE_SIZE = (18, 12)
plt.figure(figsize=IMAGE_SIZE)
img = Image.fromarray(img_np.astype('uint8'), 'RGB')
x, y = img.size
print(img.size)

fig, ax = plt.subplots(1)
# Display the image
ax.imshow(img_np)

# Draw box and label for each detection
for detect in image_boxes:
    label = detect['label']
    box = detect['box']
    ymin, xmin, ymax, xmax = box['topY'], box['topX'], box['bottomY'], box['bottomX']
    topleft_x, topleft_y = x * xmin, y * ymin
    width, height = x * (xmax - xmin), y * (ymax - ymin)
    print('{}: {}, {}, {}, {}'.format(detect['label'], topleft_x, topleft_y, width, height))

    rect = patches.Rectangle((topleft_x, topleft_y), width, height,
                             linewidth=1, edgecolor='green', facecolor='none')
    ax.add_patch(rect)

    color = 'green'
    plt.text(topleft_x, topleft_y, label, color=color)

plt.show()
import cv2
import numpy as np
import matplotlib.patches as patches
import matplotlib.pyplot as plt
%matplotlib inline

def display_detections(image, boxes, labels, scores, masks, resize_height,
                       resize_width, classes, score_threshold):
    """Visualize boxes and masks

    :param image: raw image
    :type image: PIL image
    :param boxes: box with shape (No. of instances, 4)
    :type boxes: ndarray
    :param labels: classes with shape (No. of instances,)
    :type labels: ndarray
    :param scores: scores with shape (No. of instances,)
    :type scores: ndarray
    :param masks: masks with shape (No. of instances, 1, HEIGHT, WIDTH)
    :type masks: ndarray
    :param resize_height: expected height of an input image in onnx model
    :type resize_height: Int
    :param resize_width: expected width of an input image in onnx model
    :type resize_width: Int
    :param classes: classes with shape (No. of classes)
    :type classes: list
    :param score_threshold: threshold on scores in the range of 0-1
    :type score_threshold: float
    :return: None
    """
    _, ax = plt.subplots(1, figsize=(12, 9))
    image = np.array(image)
    original_height = image.shape[0]
    original_width = image.shape[1]

    for mask, box, label, score in zip(masks, boxes, labels, scores):
        if score <= score_threshold:
            continue
        mask = mask[0, :, :, None]
        # resize boxes to original raw input size
        box = [box[0] * original_width / resize_width,
               box[1] * original_height / resize_height,
               box[2] * original_width / resize_width,
               box[3] * original_height / resize_height]

        mask = cv2.resize(mask, (image.shape[1], image.shape[0]), 0, 0, interpolation=cv2.INTER_NEAREST)
        # mask is a matrix with values in the range of [0,1]
        # higher values indicate presence of object and vice versa
        # select threshold or cut-off value to get objects present
        mask = mask > score_threshold

        image_masked = image.copy()
        image_masked[mask] = (0, 255, 255)
        alpha = 0.5  # alpha blending with range 0 to 1
        cv2.addWeighted(image_masked, alpha, image, 1 - alpha, 0, image)

        rect = patches.Rectangle((box[0], box[1]), box[2] - box[0], box[3] - box[1],
                                 linewidth=1, edgecolor='b', facecolor='none')
        ax.annotate(classes[label] + ':' + str(np.round(score, 2)), (box[0], box[1]),
                    color='w', fontsize=12)
        ax.add_patch(rect)

    ax.imshow(image)
    plt.show()
score_threshold = 0.5
img = Image.open(image_files[1])  # replace with desired image index
boxes, labels, scores, masks = predictions[4:8]  # outputs for the second image in the batch (4 outputs per image)
display_detections(img, boxes.copy(), labels, scores, masks.copy(),
                   height_onnx, width_onnx, classes, score_threshold)
Learn more about computer vision tasks in AutoML
Troubleshoot AutoML experiments