I recently used YOLOv5 for an object-detection study. This post records the whole workflow for my own reference, and hopefully it also helps beginners get up and running with YOLOv5 quickly.
References:
(1) https://blog.csdn.net/oJiWuXuan/article/details/107558286
(2) https://blog.csdn.net/sihaiyinan/article/details/89417963
For convenience, you can directly download the code I have already debugged; it adds a few things on top of the original code, such as mAP computation.
Baidu Netdisk:
Link: https://pan.baidu.com/s/1sbqoA5-xY3z5bZIItwwO5g
Extraction code: 0ved
You can also download the original code from GitHub:
Code: https://github.com/ultralytics/yolov5
Download the weights:
I downloaded yolov5s.pt and saved it under yolov5\weights.
Under yolov5\data, create three new folders: Annotations, ImageSets, and labels.
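If you prefer to create these folders from code, here is a minimal sketch (run from the yolov5 root; the paths are the ones used throughout this post):

import os

# create data/Annotations, data/ImageSets and data/labels if they do not exist yet
for folder in ['data/Annotations', 'data/ImageSets', 'data/labels']:
    os.makedirs(folder, exist_ok=True)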
First empty the images folder, then put the training images into images and the corresponding xml files into Annotations, as shown in the figure below.
Create makeTxt.py in the yolov5 root directory; the code is as follows:
import os
import random

trainval_percent = 0.03   # fraction of images that also go into the validation list
train_percent = 1.0       # not used below; every image name is written to train.txt
xmlfilepath = './data/Annotations'
total_xml = os.listdir(xmlfilepath)

num = len(total_xml)
list = range(num)
tv = int(num * trainval_percent)
trainval = random.sample(list, tv)

txt_train = './data/ImageSets/train.txt'
if os.path.exists(txt_train):
    os.remove(txt_train)
else:
    open(txt_train, 'w')

txt_val = './data/ImageSets/val.txt'
if os.path.exists(txt_val):
    os.remove(txt_val)
else:
    open(txt_val, 'w')

ftrain = open(txt_train, 'w')
fval = open(txt_val, 'w')

for i in list:
    name = total_xml[i][:-4] + '\n'
    ftrain.write(name)
    if i in trainval:
        fval.write(name)

ftrain.close()
fval.close()
makeTxt.py splits the dataset into a training set and a validation set. After it runs, two files are generated in the ImageSets folder that record the image names of the training and validation sets, as shown in the figure below.
I modified makeTxt.py for my own research needs; for the original makeTxt.py, see
https://blog.csdn.net/oJiWuXuan/article/details/107558286
voc_label.py
Next, create another file, voc_label.py. Important: the entries in classes=[…] must be exactly the class names you annotated in your dataset, one entry per annotated class; if they are wrong, the annotation information in the xml files cannot be read. The code is as follows:
# -*- coding: utf-8 -*-
# xml parsing
import xml.etree.ElementTree as ET
import os
from os import getcwd
import shutil

sets = ['train', 'val']
classes = ['combustion_lining', 'fan', 'fan_stator_casing_and_support', 'hp_core_casing', 'hpc_spool',
           'hpc_stage_5', 'mixer', 'nozzle', 'nozzle_cone', 'stand']
style = '.png'


def convert(size, box):
    # Normalize a box. size: (image w, image h), box: (xmin, xmax, ymin, ymax)
    dw = 1. / size[0]
    dh = 1. / size[1]
    x = (box[0] + box[1]) / 2.0  # box centre x in pixels
    y = (box[2] + box[3]) / 2.0  # box centre y in pixels
    w = box[1] - box[0]          # box width in pixels
    h = box[3] - box[2]          # box height in pixels
    x = x * dw                   # centre x relative to image width
    w = w * dw                   # width relative to image width
    y = y * dh                   # centre y relative to image height
    h = h * dh                   # height relative to image height
    return (x, y, w, h)          # all values are in [0, 1]


def convert_annotation(image_id):
    """Convert the xml annotation of one image into a YOLO label file.

    The xml file contains the bounding boxes and the image size. After parsing
    and normalization, each image gets exactly one label file whose lines have
    the format: class x y w h (one line per object in the image).
    """
    in_file = open('./data/Annotations/%s.xml' % (image_id), encoding='utf-8')
    # the label file to write: <object-class> <x> <y> <width> <height>
    out_file = open('./data/labels/%s.txt' % (image_id), 'w', encoding='utf-8')
    tree = ET.parse(in_file)
    root = tree.getroot()
    # image size
    size = root.find('size')
    if size != None:  # guard against xml files without a size entry
        w = int(size.find('width').text)
        h = int(size.find('height').text)
        # iterate over the objects
        for obj in root.iter('object'):
            difficult = obj.find('difficult').text
            cls = obj.find('name').text
            # skip classes that are not in our predefined list, and difficult objects
            if cls not in classes or int(difficult) == 1:
                continue
            cls_id = classes.index(cls)
            xmlbox = obj.find('bndbox')
            # bndbox as (xmin, xmax, ymin, ymax)
            b = (float(xmlbox.find('xmin').text), float(xmlbox.find('xmax').text),
                 float(xmlbox.find('ymin').text), float(xmlbox.find('ymax').text))
            print(image_id, cls, b)
            bb = convert((w, h), b)  # normalized (x, y, w, h)
            # write "class x y w h" into the label file
            out_file.write(str(cls_id) + " " + " ".join([str(a) for a in bb]) + '\n')


# current working directory
wd = getcwd()
print(wd)

# recreate the labels folder
labels = './data/labels'
if os.path.exists(labels):
    shutil.rmtree(labels)  # delete output folder
os.makedirs(labels)        # make new output folder

for image_set in sets:
    # For each split: (1) write the full path of every image into data/<set>.txt
    # so the images can be located later, and (2) convert each image's xml
    # annotation into a label file under data/labels.
    # read the image name list in ImageSets (train.txt / val.txt)
    image_ids = open('./data/ImageSets/%s.txt' % (image_set)).read().strip().split()
    txt_name = './data/%s.txt' % (image_set)
    if os.path.exists(txt_name):
        os.remove(txt_name)
    else:
        open(txt_name, 'w')
    list_file = open(txt_name, 'w')
    # write the image id with its full path, one per line
    for image_id in image_ids:
        list_file.write('data/images/%s%s\n' % (image_id, style))
        convert_annotation(image_id)
    list_file.close()
voc_label.py reads the annotation information out of the xml files of the annotated image dataset and writes it into txt files; after it runs, the labels folder contains the label files for all images, as shown below.
It also generates train.txt and val.txt under the data folder.
At this point, all the data needed for this training run is ready.
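As an optional sanity check, the sketch below (assuming the paths used above) verifies that every generated label line follows the "class x_center y_center width height" format with values normalized to [0, 1]:

import glob

for path in glob.glob('data/labels/*.txt'):
    with open(path) as f:
        for line in f:
            cls_id, x, y, w, h = line.split()
            assert 0 <= int(cls_id) < 10  # class index must be a valid index into classes
            # all box values produced by convert() are normalized to [0, 1]
            assert all(0.0 <= float(v) <= 1.0 for v in (x, y, w, h))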
Modifying the dataset yaml file
First, create object.yaml under the data directory and configure its parameters: train and val are followed by the paths of the training-set and validation-set image lists, nc is the number of classes in the dataset (10 in my case), and names should be replaced with your own class names.
The file looks like this:

# COCO 2017 dataset http://cocodataset.org
# Download command: bash yolov5/data/get_coco2017.sh
# Train command: python train.py --data ./data/coco.yaml
# Dataset should be placed next to yolov5 folder:
#   /parent_folder
#     /coco
#     /yolov5

# train and val datasets (image directory or *.txt file with image paths)
train: data/train.txt  # 118k images
val: data/val.txt      # 5k images
#test: data/test.txt   # 20k images for submission to https://competitions.codalab.org/competitions/20794

# number of classes
nc: 10

# class names
names: ['combustion_lining', 'fan', 'fan_stator_casing_and_support', 'hp_core_casing', 'hpc_spool',
        'hpc_stage_5', 'mixer', 'nozzle', 'nozzle_cone', 'stand']

# Print classes
# with open('data/coco.yaml') as f:
#     d = yaml.load(f, Loader=yaml.FullLoader)  # dict
#     for i, x in enumerate(d['names']):
#         print(i, x)
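As a quick check, you can load object.yaml and confirm that nc matches the number of entries in names; a minimal sketch (assuming PyYAML is installed):

import yaml

with open('data/object.yaml') as f:
    d = yaml.safe_load(f)
assert d['nc'] == len(d['names'])  # the class count must match the names list
for i, name in enumerate(d['names']):
    print(i, name)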
Modifying the model yaml file
In the yolov5\models directory, pick a model; I used the yolov5s.yaml file. The only thing that needs to be changed in it is nc. Mine is 10 classes.
# parameters
nc: 10  # number of classes
depth_multiple: 0.33  # model depth multiple
width_multiple: 0.50  # layer channel multiple
Modifying some parameters in train.py
Finally, modify some of the arguments in train.py in the root directory; set batch-size and workers according to your machine's capabilities, as shown below:
parser = argparse.ArgumentParser()
parser.add_argument('--weights', type=str, default='weights/yolov5s.pt', help='initial weights path')
parser.add_argument('--cfg', type=str, default='models/yolov5s.yaml', help='model.yaml path')
parser.add_argument('--data', type=str, default='data/object.yaml', help='data.yaml path')
parser.add_argument('--hyp', type=str, default='data/hyp.scratch.yaml', help='hyperparameters path')
parser.add_argument('--epochs', type=int, default=300)
parser.add_argument('--batch-size', type=int, default=4, help='total batch size for all GPUs')
parser.add_argument('--workers', type=int, default=2, help='maximum number of dataloader workers')
Once everything is configured, run train.py to start training. After training, the yolov5\runs\train\exp folder contains the following files:
best.pt is the best weight obtained across all training epochs, and last.pt is the weight from the last epoch. During training, you can run criterion_line.py in the yolov5\loss_line folder to monitor the loss and other curves in real time:
The loss_line folder has to be created by yourself. criterion_line.py:
import matplotlib.pyplot as plt
import numpy as np

# parse the metrics written by train.py into runs/train/exp/results.txt
with open('../runs/train/exp/results.txt', 'r') as out_data:
    text = out_data.readlines()  # list of str

loss = []
for ss in text:
    ss = ss.strip()
    ss = ss.split()
    strr = ss[2:6] + ss[8:12]
    numbers = list(map(float, strr))
    loss.append(numbers)

# column order: 0-GIoU, 1-obj, 2-cls, 3-total, 4-P, 5-R, 6-mAP@.5, 7-mAP@.5:.95
loss = np.array(loss)
epoch_n = len(loss)
x = np.linspace(1, epoch_n, epoch_n)

GIoU = loss[:, 0]
obj = loss[:, 1]
cls = loss[:, 2]
total = loss[:, 3]
P = loss[:, 4]
R = loss[:, 5]
mAP_5 = loss[:, 6]
mAP_5_95 = loss[:, 7]

plt.figure(num=1, figsize=(16, 10))

plt.subplot(4, 2, 1)
plt.plot(x, GIoU, color='red', linewidth=1.0, linestyle='--', label='GIoU')
plt.legend(loc='upper right')

plt.subplot(4, 2, 2)
plt.plot(x, obj, color='red', linewidth=1.0, linestyle='--', label='obj')
plt.legend(loc='upper right')

plt.subplot(4, 2, 3)
plt.plot(x, cls, color='red', linewidth=1.0, linestyle='--', label='cls')
plt.legend(loc='upper right')

plt.subplot(4, 2, 4)
plt.plot(x, total, color='red', linewidth=1.0, linestyle='--', label='total')
plt.legend(loc='upper right')

plt.subplot(4, 2, 5)
plt.plot(x, P, color='red', linewidth=1.0, linestyle='--', label='P')
plt.legend(loc='upper right')

plt.subplot(4, 2, 6)
plt.plot(x, R, color='red', linewidth=1.0, linestyle='--', label='R')
plt.legend(loc='upper right')

plt.subplot(4, 2, 7)
plt.plot(x, mAP_5, color='red', linewidth=1.0, linestyle='--', label='mAP_5')
plt.legend(loc='upper right')

plt.subplot(4, 2, 8)
plt.plot(x, mAP_5_95, color='red', linewidth=1.0, linestyle='--', label='mAP_5_95')
plt.legend(loc='upper right')

plt.show()
Result plots:
Create data_test
Create a data_test folder under yolov5, and inside it create five folders and one txt file, as shown below.
Put the test-set images into the JPEGImages_manual folder and the corresponding xml files into Annotations_manual.
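The five folders and the txt file can also be created from code; the names below are the ones referenced by cfg_mAP.py further down (a minimal sketch, run from the yolov5 root):

import os

base = 'data_test'
for folder in ['JPEGImages_manual', 'Annotations_manual', 'class_txt_manual',
               'cachedir_manual', 'predictions_manual']:
    os.makedirs(os.path.join(base, folder), exist_ok=True)

# empty name list; detect_eval_class_txt.py appends the test image names to it
open(os.path.join(base, 'imgs_name_manual.txt'), 'a').close()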
Create a few py files
Create an mAP folder under yolov5, and inside it create cfg_mAP.py, detect_eval_class_txt.py, compute_mAP.py, mAP_line.py, utils_mAP.py and yolov5_eval.py.
The code for each file is listed below. cfg_mAP.py:
# -*- coding: utf-8 -*-
import os
from easydict import EasyDict

Cfg = EasyDict()

Cfg.names = ['combustion_lining', 'fan', 'fan_stator_casing_and_support', 'hp_core_casing', 'hpc_spool',
             'hpc_stage_5', 'mixer', 'nozzle', 'nozzle_cone', 'stand']
# The original names are too long and clutter the images, so shortened names are used when drawing labels.
Cfg.textnames = ['combustion', 'fan', 'stator', 'core', 'spool', 'stage', 'mixer', 'nozzle', 'cone', 'stand']

Cfg.device = '0,1'

# manual test set
Cfg.origimgs_filepath = '../data_test/JPEGImages_manual'
Cfg.testimgs_filepath = '../data_test/JPEGImages_manual'
Cfg.eval_classtxt_path = '../data_test/class_txt_manual/'
Cfg.eval_Annotations_path = '../data_test/Annotations_manual'
Cfg.eval_imgs_name_txt = '../data_test/imgs_name_manual.txt'
Cfg.cachedir = '../data_test/cachedir_manual/'
Cfg.prediction_path = '../data_test/predictions_manual'

# mAP_line cachedirs
Cfg.systhesis_valid_cachedir = '../data_test/cachedir_systhesis_valid/'
Cfg.manual_cachedir = '../data_test/cachedir_manual/'
detect_eval_class_txt.py:
import argparse import os import platform import shutil import time from pathlib import Path import cv2 import torch import torch.backends.cudnn as cudnn from numpy import random from models.experimental import attempt_load from utils.datasets import LoadStreams, LoadImages from utils.general import ( check_img_size, non_max_suppression, apply_classifier, scale_coords, xyxy2xywh, plot_one_box, strip_optimizer, set_logging) from utils.torch_utils import select_device, load_classifier, time_synchronized from cfg_mAP import Cfg cfg = Cfg def detect(save_img=False): out, source, weights, view_img, save_txt, imgsz = \ opt.output, opt.source, opt.weights, opt.view_img, opt.save_txt, opt.img_size webcam = source == '0' or source.startswith('rtsp') or source.startswith('http') or source.endswith('.txt') # Initialize set_logging() device = select_device(opt.device) if os.path.exists(out): shutil.rmtree(out) # delete output folder os.makedirs(out) # make new output folder half = device.type != 'cpu' # half precision only supported on CUDA # Load model model = attempt_load(weights, map_location=device) # load FP32 model imgsz = check_img_size(imgsz, s=model.stride.max()) # check img_size if half: model.half() # to FP16 # Second-stage classifier classify = False if classify: modelc = load_classifier(name='resnet101', n=2) # initialize modelc.load_state_dict(torch.load('weights/resnet101.pt', map_location=device)['model']) # load weights modelc.to(device).eval() # Set Dataloader vid_path, vid_writer = None, None if webcam: view_img = True cudnn.benchmark = True # set True to speed up constant image size inference dataset = LoadStreams(source, img_size=imgsz) else: save_img = True dataset = LoadImages(source, img_size=imgsz) # Get names and colors names = model.module.names if hasattr(model, 'module') else model.names colors = [[random.randint(0, 255) for _ in range(3)] for _ in range(len(names))] # Run inference t0 = time.time() img = torch.zeros((1, 3, imgsz, imgsz), device=device) # init img _ = model(img.half() if half else img) if device.type != 'cpu' else None # run once test_time=[] for path, img, im0s, vid_cap in dataset: # Inference t1 = time_synchronized() img = torch.from_numpy(img).to(device) img = img.half() if half else img.float() # uint8 to fp16/32 img /= 255.0 # 0 - 255 to 0.0 - 1.0 if img.ndimension() == 3: img = img.unsqueeze(0) # # Inference # t1 = time_synchronized() pred = model(img, augment=opt.augment)[0] # Apply NMS pred = non_max_suppression(pred, opt.conf_thres, opt.iou_thres, classes=opt.classes, agnostic=opt.agnostic_nms) t2 = time_synchronized() # Apply Classifier if classify: pred = apply_classifier(pred, modelc, img, im0s) # Process detections for i, det in enumerate(pred): # detections per image if webcam: # batch_size >= 1 p, s, im0 = path[i], '%g: ' % i, im0s[i].copy() else: p, s, im0 = path, '', im0s img_name = Path(p).name txt = open(opt.eval_imgs_name_txt, 'a') txt.write(img_name[:-4]) txt.write('\n') txt.close() save_path = str(Path(out) / Path(p).name) txt_path = str(Path(out) / Path(p).stem) + ('_%g' % dataset.frame if dataset.mode == 'video' else '') s += '%gx%g ' % img.shape[2:] # print string gn = torch.tensor(im0.shape)[[1, 0, 1, 0]] # normalization gain whwh if det is not None and len(det): # Rescale boxes from img_size to im0 size det[:, :4] = scale_coords(img.shape[2:], det[:, :4], im0.shape).round() # Print results for c in det[:, -1].unique(): n = (det[:, -1] == c).sum() # detections per class s += '%g %ss, ' % (n, names[int(c)]) # add to string # Write 
results for *xyxy, conf, cls in reversed(det): txt = open(opt.eval_classtxt_path + '/%s' % names[int(cls)], 'a') obj_conf = conf.cpu().numpy() xyxy = torch.tensor(xyxy).numpy() x1 = xyxy[0] y1 = xyxy[1] x2 = xyxy[2] y2 = xyxy[3] new_box = [img_name[:-4], obj_conf, x1, y1, x2, y2] txt.write(" ".join([str(a) for a in new_box])) txt.write('\n') txt.close() if save_txt: # Write to file xywh = (xyxy2xywh(torch.tensor(xyxy).view(1, 4)) / gn).view(-1).tolist() # normalized xywh with open(txt_path + '.txt', 'a') as f: f.write(('%g ' * 5 + '\n') % (cls, *xywh)) # label format if save_img or view_img: # Add bbox to image label = '%s %.2f' % (cfg.textnames[int(cls)], conf) plot_one_box(xyxy, im0, label=label, color=colors[int(cls)], line_thickness=3) test_time.append(t2 - t1) # Print time (inference + NMS) print('%sDone. (%.3fs)' % (s, t2 - t1)) # Stream results if view_img: cv2.imshow(p, im0) if cv2.waitKey(1) == ord('q'): # q to quit raise StopIteration # Save results (image with detections) if save_img: if dataset.mode == 'images': cv2.imwrite(save_path, im0) else: if vid_path != save_path: # new video vid_path = save_path if isinstance(vid_writer, cv2.VideoWriter): vid_writer.release() # release previous video writer fourcc = 'mp4v' # output video codec fps = vid_cap.get(cv2.CAP_PROP_FPS) w = int(vid_cap.get(cv2.CAP_PROP_FRAME_WIDTH)) h = int(vid_cap.get(cv2.CAP_PROP_FRAME_HEIGHT)) vid_writer = cv2.VideoWriter(save_path, cv2.VideoWriter_fourcc(*fourcc), fps, (w, h)) vid_writer.write(im0) if save_txt or save_img: print('Results saved to %s' % Path(out)) if platform.system() == 'Darwin' and not opt.update: # MacOS os.system('open ' + save_path) print('Done. (%.3fs)' % (time.time() - t0)) mean_time=sum(test_time)/len(test_time) print('mean time:', mean_time) print('frame: ', 1/mean_time) if __name__ == '__main__': dir = '../data_test/imgs_name_manual.txt' if os.path.exists(dir): os.remove(dir) else: open(dir, 'w') predictions_manual='../data_test/predictions_manual' class_txt_manual='../data_test/class_txt_manual' cachedir_manual='../data_test/cachedir_manual' if os.path.exists(predictions_manual): shutil.rmtree(predictions_manual) # delete output folder os.makedirs(predictions_manual) # make new output folder if os.path.exists(class_txt_manual): shutil.rmtree(class_txt_manual) # delete output folder os.makedirs(class_txt_manual) # make new output folder if os.path.exists(cachedir_manual): shutil.rmtree(cachedir_manual) # delete output folder os.makedirs(cachedir_manual) # make new output folder parser = argparse.ArgumentParser() parser.add_argument('--weights', nargs='+', type=str, default='../runs/train/exp/weights/last.pt', help='model.pt path(s)') parser.add_argument('--source', type=str, default='../data_test/JPEGImages_manual', help='source') # file/folder, 0 for webcam parser.add_argument('--output', type=str, default='../data_test/predictions_manual', help='output folder') # output folder parser.add_argument('--eval_imgs_name_txt', type=str, default='../data_test/imgs_name_manual.txt', help='output folder') # output folder parser.add_argument('--eval_classtxt_path', type=str, default='../data_test/class_txt_manual', help='output folder') # output folder parser.add_argument('--img-size', type=int, default=640, help='inference size (pixels)') parser.add_argument('--conf-thres', type=float, default=0.4, help='object confidence threshold') parser.add_argument('--iou-thres', type=float, default=0.5, help='IOU threshold for NMS') parser.add_argument('--device', default='', help='cuda device, i.e. 
0 or 0,1,2,3 or cpu') parser.add_argument('--view-img', action='store_true', help='display results') parser.add_argument('--save-txt', action='store_true', help='save results to *.txt') parser.add_argument('--classes', nargs='+', type=int, help='filter by class: --class 0, or --class 0 2 3') parser.add_argument('--agnostic-nms', action='store_true', help='class-agnostic NMS') parser.add_argument('--augment', action='store_true', help='augmented inference') parser.add_argument('--update', action='store_true', help='update all models') opt = parser.parse_args() print(opt) with torch.no_grad(): if opt.update: # update all models (to fix SourceChangeWarning) for opt.weights in ['yolov5s.pt', 'yolov5m.pt', 'yolov5l.pt', 'yolov5x.pt']: detect() strip_optimizer(opt.weights) else: detect()
In addition, the plot_one_box function needs to be added to yolov5\utils\general.py.
def plot_one_box(x, img, color=None, label=None, line_thickness=None):
    # Plots one bounding box on image img
    tl = line_thickness or round(0.002 * (img.shape[0] + img.shape[1]) / 2) + 1  # line/font thickness
    color = color or [random.randint(0, 255) for _ in range(3)]
    c1, c2 = (int(x[0]), int(x[1])), (int(x[2]), int(x[3]))
    cv2.rectangle(img, c1, c2, color, thickness=tl, lineType=cv2.LINE_AA)
    if label:
        tf = max(tl - 1, 1)  # font thickness
        t_size = cv2.getTextSize(label, 0, fontScale=tl / 3, thickness=tf)[0]
        c2 = c1[0] + t_size[0], c1[1] - t_size[1] - 3
        cv2.rectangle(img, c1, c2, color, -1, cv2.LINE_AA)  # filled
        cv2.putText(img, label, (c1[0], c1[1] - 2), 0, tl / 3, [225, 255, 255], thickness=tf,
                    lineType=cv2.LINE_AA)
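For reference, a minimal usage sketch of plot_one_box; the image path and the box coordinates here are made up for illustration, and the box is in pixel (x1, y1, x2, y2) order:

import cv2

img = cv2.imread('data/images/example.png')  # hypothetical image path
plot_one_box([50, 60, 200, 220], img, color=(0, 0, 255), label='fan 0.92', line_thickness=2)
cv2.imwrite('example_boxed.png', img)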
compute_mAP.py:
# -*- coding: utf-8 -*-
import os
import numpy as np
from yolov5_eval import yolov5_eval  # yolov5_eval.py and compute_mAP.py must sit in the same directory
from cfg_mAP import Cfg
import pickle
import shutil

cfg = Cfg

eval_classtxt_path = cfg.eval_classtxt_path  # folder with one detection txt file per class
eval_classtxt_files = os.listdir(eval_classtxt_path)

classes = cfg.names

aps = []       # AP of each class
cls_rec = {}   # recall of each class
cls_prec = {}  # precision of each class
cls_ap = {}

annopath = cfg.eval_Annotations_path + '/{:s}.xml'  # {:s}.xml is filled with the image name later
imagesetfile = cfg.eval_imgs_name_txt               # list of test image names
cachedir = cfg.cachedir
if os.path.exists(cachedir):
    shutil.rmtree(cachedir)  # delete output folder
os.makedirs(cachedir)        # make new output folder

for cls in eval_classtxt_files:
    # detection results of class cls
    filename = eval_classtxt_path + cls
    # yolov5_eval.py computes recall, precision and AP for this class
    rec, prec, ap = yolov5_eval(
        filename, annopath, imagesetfile, cls, cachedir,
        ovthresh=0.5, use_07_metric=False)
    aps += [ap]
    cls_ap[cls] = ap
    cls_rec[cls] = rec[-1]
    cls_prec[cls] = prec[-1]
    print('AP for {} = {:.4f}'.format(cls, ap))
    print('recall for {} = {:.4f}'.format(cls, rec[-1]))
    print('precision for {} = {:.4f}'.format(cls, prec[-1]))

with open(os.path.join(cfg.cachedir, 'cls_ap.pkl'), 'wb') as in_data:
    pickle.dump(cls_ap, in_data, pickle.HIGHEST_PROTOCOL)
with open(os.path.join(cfg.cachedir, 'cls_rec.pkl'), 'wb') as in_data:
    pickle.dump(cls_rec, in_data, pickle.HIGHEST_PROTOCOL)
with open(os.path.join(cfg.cachedir, 'cls_prec.pkl'), 'wb') as in_data:
    pickle.dump(cls_prec, in_data, pickle.HIGHEST_PROTOCOL)

print('Mean AP = {:.4f}'.format(np.mean(aps)))
print('~~~~~~~~')
print('Results:')
for ap in aps:
    print('{:.3f}'.format(ap))
print('~~~~~~~~')
print('{:.3f}'.format(np.mean(aps)))
print('~~~~~~~~')
mAP_line.py:
import os
import matplotlib.pyplot as plt
import numpy as np
import pickle
from cfg_mAP import Cfg

cfg = Cfg

x = np.linspace(1, 10, 10)
ap_systhesis_valid = []
ap_manual = []

plt.figure(num=1, figsize=(8, 5))

# load the per-class AP dict saved by compute_mAP.py
with open(os.path.join(cfg.manual_cachedir, 'cls_ap.pkl'), 'rb') as out_data:
    manual_cls_ap = pickle.load(out_data)
    print(manual_cls_ap)
    print(len(manual_cls_ap))

for cls in cfg.names:
    if cls in manual_cls_ap.keys():
        ap_manual.append(manual_cls_ap[cls])
    else:
        ap_manual.append(0.0)
print('ap_manual: ', ap_manual)
manual_mAP = np.mean(ap_manual)

l2, = plt.plot(x, ap_manual, color='k', linewidth=1.0, linestyle='-.', label='manual_AP')
plt.scatter(x, ap_manual, s=10, color='k')
for x1, y1 in zip(x, ap_manual):
    plt.text(x1, y1, '%s' % str('{0:.3f}'.format(y1)), fontdict={'fontsize': 14},
             verticalalignment="bottom", horizontalalignment="center")
plt.annotate(r'manual_mAP=%s' % str('{0:.3f}'.format(manual_mAP)), xy=(5, manual_mAP), xycoords='data',
             xytext=(0.0, 0.0), textcoords='offset points', fontsize=13)

plt.xticks(np.linspace(1, 10, 10),
           [r'combustion_lining', r'fan', r'fan_support', r'hp_core_casing', r'hpc_spool',
            r'hpc_stage5', r'mixer', r'nozzle', r'nozzle_cone', r'stand'])
plt.legend(handles=[l2], loc='best')
plt.show()
utils_mAP.py:
import sys import os import time import math import torch import numpy as np from PIL import Image, ImageDraw, ImageFont from torch.autograd import Variable import itertools import struct # get_image_size import imghdr # get_image_size def sigmoid(x): return 1.0 / (np.exp(-x) + 1.) def softmax(x): x = np.exp(x - np.expand_dims(np.max(x, axis=1), axis=1)) x = x / np.expand_dims(x.sum(axis=1), axis=1) return x def bbox_iou(box1, box2, x1y1x2y2=True): if x1y1x2y2: mx = min(box1[0], box2[0]) Mx = max(box1[2], box2[2]) my = min(box1[1], box2[1]) My = max(box1[3], box2[3]) w1 = box1[2] - box1[0] h1 = box1[3] - box1[1] w2 = box2[2] - box2[0] h2 = box2[3] - box2[1] else: mx = min(box1[0] - box1[2] / 2.0, box2[0] - box2[2] / 2.0) Mx = max(box1[0] + box1[2] / 2.0, box2[0] + box2[2] / 2.0) my = min(box1[1] - box1[3] / 2.0, box2[1] - box2[3] / 2.0) My = max(box1[1] + box1[3] / 2.0, box2[1] + box2[3] / 2.0) w1 = box1[2] h1 = box1[3] w2 = box2[2] h2 = box2[3] uw = Mx - mx uh = My - my cw = w1 + w2 - uw ch = h1 + h2 - uh carea = 0 if cw <= 0 or ch <= 0: return 0.0 area1 = w1 * h1 area2 = w2 * h2 carea = cw * ch uarea = area1 + area2 - carea return carea / uarea def bbox_ious(boxes1, boxes2, x1y1x2y2=True): if x1y1x2y2: mx = torch.min(boxes1[0], boxes2[0]) Mx = torch.max(boxes1[2], boxes2[2]) my = torch.min(boxes1[1], boxes2[1]) My = torch.max(boxes1[3], boxes2[3]) w1 = boxes1[2] - boxes1[0] h1 = boxes1[3] - boxes1[1] w2 = boxes2[2] - boxes2[0] h2 = boxes2[3] - boxes2[1] else: mx = torch.min(boxes1[0] - boxes1[2] / 2.0, boxes2[0] - boxes2[2] / 2.0) Mx = torch.max(boxes1[0] + boxes1[2] / 2.0, boxes2[0] + boxes2[2] / 2.0) my = torch.min(boxes1[1] - boxes1[3] / 2.0, boxes2[1] - boxes2[3] / 2.0) My = torch.max(boxes1[1] + boxes1[3] / 2.0, boxes2[1] + boxes2[3] / 2.0) w1 = boxes1[2] h1 = boxes1[3] w2 = boxes2[2] h2 = boxes2[3] uw = Mx - mx uh = My - my cw = w1 + w2 - uw ch = h1 + h2 - uh mask = ((cw <= 0) + (ch <= 0) > 0) area1 = w1 * h1 area2 = w2 * h2 carea = cw * ch carea[mask] = 0 uarea = area1 + area2 - carea return carea / uarea def nms(_boxes, _nms_thresh): if len(_boxes) == 0: return _boxes det_confs = torch.zeros(len(_boxes)) for i in range(len(_boxes)): det_confs[i] = 1 - _boxes[i][4] _, sortIds = torch.sort(det_confs) out_boxes = [] for i in range(len(_boxes)): box_i = _boxes[sortIds[i]] if box_i[4] > 0: out_boxes.append(box_i) for j in range(i + 1, len(_boxes)): box_j = _boxes[sortIds[j]] if bbox_iou(box_i, box_j, x1y1x2y2=False) > _nms_thresh: # print(box_i, box_j, bbox_iou(box_i, box_j, x1y1x2y2=False)) box_j[4] = 0 return out_boxes def convert2cpu(gpu_matrix): return torch.FloatTensor(gpu_matrix.size()).copy_(gpu_matrix) def convert2cpu_long(gpu_matrix): return torch.LongTensor(gpu_matrix.size()).copy_(gpu_matrix) def get_region_boxes_in_model(output, conf_thresh, num_classes, anchors, num_anchors, only_objectness=1, validation=False): anchor_step = len(anchors) // num_anchors if output.dim() == 3: output = output.unsqueeze(0) batch = output.size(0) assert (output.size(1) == (5 + num_classes) * num_anchors) h = output.size(2) w = output.size(3) t0 = time.time() all_boxes = [] output = output.view(batch * num_anchors, 5 + num_classes, h * w).transpose(0, 1).contiguous().view(5 + num_classes, batch * num_anchors * h * w) grid_x = torch.linspace(0, w - 1, w).repeat(h, 1).repeat(batch * num_anchors, 1, 1).view( batch * num_anchors * h * w).type_as(output) # cuda() grid_y = torch.linspace(0, h - 1, h).repeat(w, 1).t().repeat(batch * num_anchors, 1, 1).view( batch * num_anchors * h * 
w).type_as(output) # cuda() xs = torch.sigmoid(output[0]) + grid_x ys = torch.sigmoid(output[1]) + grid_y anchor_w = torch.Tensor(anchors).view(num_anchors, anchor_step).index_select(1, torch.LongTensor([0])) anchor_h = torch.Tensor(anchors).view(num_anchors, anchor_step).index_select(1, torch.LongTensor([1])) anchor_w = anchor_w.repeat(batch, 1).repeat(1, 1, h * w).view(batch * num_anchors * h * w).type_as(output) # cuda() anchor_h = anchor_h.repeat(batch, 1).repeat(1, 1, h * w).view(batch * num_anchors * h * w).type_as(output) # cuda() ws = torch.exp(output[2]) * anchor_w hs = torch.exp(output[3]) * anchor_h det_confs = torch.sigmoid(output[4]) cls_confs = torch.nn.Softmax()(Variable(output[5:5 + num_classes].transpose(0, 1))).data cls_max_confs, cls_max_ids = torch.max(cls_confs, 1) cls_max_confs = cls_max_confs.view(-1) cls_max_ids = cls_max_ids.view(-1) t1 = time.time() sz_hw = h * w sz_hwa = sz_hw * num_anchors det_confs = convert2cpu(det_confs) cls_max_confs = convert2cpu(cls_max_confs) cls_max_ids = convert2cpu_long(cls_max_ids) xs = convert2cpu(xs) ys = convert2cpu(ys) ws = convert2cpu(ws) hs = convert2cpu(hs) if validation: cls_confs = convert2cpu(cls_confs.view(-1, num_classes)) t2 = time.time() for b in range(batch): boxes = [] for cy in range(h): for cx in range(w): for i in range(num_anchors): ind = b * sz_hwa + i * sz_hw + cy * w + cx det_conf = det_confs[ind] if only_objectness: conf = det_confs[ind] else: conf = det_confs[ind] * cls_max_confs[ind] if conf > conf_thresh: bcx = xs[ind] bcy = ys[ind] bw = ws[ind] bh = hs[ind] cls_max_conf = cls_max_confs[ind] cls_max_id = cls_max_ids[ind] box = [bcx / w, bcy / h, bw / w, bh / h, det_conf, cls_max_conf, cls_max_id] if (not only_objectness) and validation: for c in range(num_classes): tmp_conf = cls_confs[ind][c] if c != cls_max_id and det_confs[ind] * tmp_conf > conf_thresh: box.append(tmp_conf) box.append(c) boxes.append(box) all_boxes.append(boxes) t3 = time.time() if False: print('---------------------------------') print('matrix computation : %f' % (t1 - t0)) print(' gpu to cpu : %f' % (t2 - t1)) print(' tpz filter : %f' % (t3 - t2)) print('---------------------------------') return all_boxes def get_region_boxes_out_model(_output, _cfg, _anchors, _num_anchors, _only_objectness=1, _validation=False): anchor_step = len(_anchors) // _num_anchors if len(_output.shape) == 3: _output = np.expand_dims(_output, axis=0) batch = _output.shape[0] assert (_output.shape[1] == (5 + _cfg.classes) * _num_anchors) h = _output.shape[2] w = _output.shape[3] t0 = time.time() all_boxes = [] _output = _output.reshape(batch * _num_anchors, 5 + _cfg.classes, h * w).transpose((1, 0, 2)).reshape( 5 + _cfg.classes, batch * _num_anchors * h * w) grid_x = np.expand_dims(np.expand_dims(np.linspace(0, w - 1, w), axis=0).repeat(h, 0), axis=0).repeat( batch * _num_anchors, axis=0).reshape( batch * _num_anchors * h * w) grid_y = np.expand_dims(np.expand_dims(np.linspace(0, h - 1, h), axis=0).repeat(w, 0).T, axis=0).repeat( batch * _num_anchors, axis=0).reshape( batch * _num_anchors * h * w) xs = sigmoid(_output[0]) + grid_x ys = sigmoid(_output[1]) + grid_y anchor_w = np.array(_anchors).reshape((_num_anchors, anchor_step))[:, 0] anchor_h = np.array(_anchors).reshape((_num_anchors, anchor_step))[:, 1] anchor_w = np.expand_dims(np.expand_dims(anchor_w, axis=1).repeat(batch, 1), axis=2) \ .repeat(h * w, axis=2).transpose(1, 0, 2).reshape(batch * _num_anchors * h * w) anchor_h = np.expand_dims(np.expand_dims(anchor_h, axis=1).repeat(batch, 1), axis=2) \ 
.repeat(h * w, axis=2).transpose(1, 0, 2).reshape(batch * _num_anchors * h * w) ws = np.exp(_output[2]) * anchor_w hs = np.exp(_output[3]) * anchor_h det_confs = sigmoid(_output[4]) cls_confs = softmax(_output[5:5 + _cfg.classes].transpose(1, 0)) cls_max_confs = np.max(cls_confs, 1) cls_max_ids = np.argmax(cls_confs, 1) t1 = time.time() sz_hw = h * w sz_hwa = sz_hw * _num_anchors t2 = time.time() for b in range(batch): boxes = [] for cy in range(h): for cx in range(w): for i in range(_num_anchors): ind = b * sz_hwa + i * sz_hw + cy * w + cx det_conf = det_confs[ind] if _only_objectness: conf = det_confs[ind] else: conf = det_confs[ind] * cls_max_confs[ind] if conf > _cfg.conf_thresh: bcx = xs[ind] bcy = ys[ind] bw = ws[ind] bh = hs[ind] cls_max_conf = cls_max_confs[ind] cls_max_id = cls_max_ids[ind] box = [bcx / w, bcy / h, bw / w, bh / h, det_conf, cls_max_conf, cls_max_id] if (not _only_objectness) and _validation: for c in range(_cfg.classes): tmp_conf = cls_confs[ind][c] if c != cls_max_id and det_confs[ind] * tmp_conf > _cfg.conf_thresh: box.append(tmp_conf) box.append(c) boxes.append(box) all_boxes.append(boxes) t3 = time.time() if False: print('---------------------------------') print('matrix computation : %f' % (t1 - t0)) print(' gpu to cpu : %f' % (t2 - t1)) print(' tpz filter : %f' % (t3 - t2)) print('---------------------------------') return all_boxes def get_classtxt_out_model(_output, _cfg, _anchors, _num_anchors, _only_objectness=1, _validation=False): anchor_step = len(_anchors) // _num_anchors if len(_output.shape) == 3: _output = np.expand_dims(_output, axis=0) batch = _output.shape[0] assert (_output.shape[1] == (5 + _cfg.n_classes) * _num_anchors) h = _output.shape[2] w = _output.shape[3] t0 = time.time() all_boxes = [] _output = _output.reshape(batch * _num_anchors, 5 + _cfg.n_classes, h * w).transpose((1, 0, 2)).reshape( 5 + _cfg.n_classes, batch * _num_anchors * h * w) grid_x = np.expand_dims(np.expand_dims(np.linspace(0, w - 1, w), axis=0).repeat(h, 0), axis=0).repeat( batch * _num_anchors, axis=0).reshape( batch * _num_anchors * h * w) grid_y = np.expand_dims(np.expand_dims(np.linspace(0, h - 1, h), axis=0).repeat(w, 0).T, axis=0).repeat( batch * _num_anchors, axis=0).reshape( batch * _num_anchors * h * w) xs = sigmoid(_output[0]) + grid_x ys = sigmoid(_output[1]) + grid_y anchor_w = np.array(_anchors).reshape((_num_anchors, anchor_step))[:, 0] anchor_h = np.array(_anchors).reshape((_num_anchors, anchor_step))[:, 1] anchor_w = np.expand_dims(np.expand_dims(anchor_w, axis=1).repeat(batch, 1), axis=2) \ .repeat(h * w, axis=2).transpose(1, 0, 2).reshape(batch * _num_anchors * h * w) anchor_h = np.expand_dims(np.expand_dims(anchor_h, axis=1).repeat(batch, 1), axis=2) \ .repeat(h * w, axis=2).transpose(1, 0, 2).reshape(batch * _num_anchors * h * w) ws = np.exp(_output[2]) * anchor_w hs = np.exp(_output[3]) * anchor_h det_confs = sigmoid(_output[4]) cls_confs = softmax(_output[5:5 + _cfg.n_classes].transpose(1, 0)) cls_max_confs = np.max(cls_confs, 1) cls_max_ids = np.argmax(cls_confs, 1) t1 = time.time() sz_hw = h * w sz_hwa = sz_hw * _num_anchors t2 = time.time() for b in range(batch): boxes = [] for cy in range(h): for cx in range(w): for i in range(_num_anchors): ind = b * sz_hwa + i * sz_hw + cy * w + cx det_conf = det_confs[ind] if _only_objectness: conf = det_confs[ind] else: conf = det_confs[ind] * cls_max_confs[ind] if conf > _cfg.conf_thresh: bcx = xs[ind] bcy = ys[ind] bw = ws[ind] bh = hs[ind] cls_max_conf = cls_max_confs[ind] cls_max_id = cls_max_ids[ind] 
box = [bcx / w, bcy / h, bw / w, bh / h, det_conf, cls_max_conf, cls_max_id] if (not _only_objectness) and _validation: for c in range(_cfg.classes): tmp_conf = cls_confs[ind][c] if c != cls_max_id and det_confs[ind] * tmp_conf > _cfg.conf_thresh: box.append(tmp_conf) box.append(c) boxes.append(box) all_boxes.append(boxes) t3 = time.time() if False: print('---------------------------------') print('matrix computation : %f' % (t1 - t0)) print(' gpu to cpu : %f' % (t2 - t1)) print(' tpz filter : %f' % (t3 - t2)) print('---------------------------------') return all_boxes def plot_boxes_cv2(img, boxes, savename=None, class_names=None, color=None): import cv2 colors = torch.FloatTensor([[1, 0, 1], [0, 0, 1], [0, 1, 1], [0, 1, 0], [1, 1, 0], [1, 0, 0]]); def get_color(c, x, max_val): ratio = float(x) / max_val * 5 i = int(math.floor(ratio)) j = int(math.ceil(ratio)) ratio = ratio - i r = (1 - ratio) * colors[i][c] + ratio * colors[j][c] return int(r * 255) width = img.shape[1] height = img.shape[0] for i in range(len(boxes)): box = boxes[i] x1 = int((box[0] - box[2] / 2.0) * width) y1 = int((box[1] - box[3] / 2.0) * height) x2 = int((box[0] + box[2] / 2.0) * width) y2 = int((box[1] + box[3] / 2.0) * height) if color: rgb = color else: rgb = (255, 0, 0) if len(box) >= 7 and class_names: cls_conf = box[5] cls_id = box[6] print('%s: %f' % (class_names[cls_id], cls_conf)) classes = len(class_names) offset = cls_id * 123457 % classes red = get_color(2, offset, classes) green = get_color(1, offset, classes) blue = get_color(0, offset, classes) if color is None: rgb = (red, green, blue) img = cv2.putText(img, class_names[cls_id], (x1, y1), cv2.FONT_HERSHEY_SIMPLEX, 1.2, rgb, 1) img = cv2.rectangle(img, (x1, y1), (x2, y2), rgb, 1) if savename: print("save plot results to %s" % savename) cv2.imwrite(savename, img) return img def plot_boxes(_img, _boxes, _savename=None, _class_names=None): font = ImageFont.truetype("consola.ttf", 40, encoding="unic") # 设置字体 colors = torch.FloatTensor([[1, 0, 1], [0, 0, 1], [0, 1, 1], [0, 1, 0], [1, 1, 0], [1, 0, 0]]); def get_color(c, x, max_val): ratio = float(x) / max_val * 5 i = int(math.floor(ratio)) j = int(math.ceil(ratio)) ratio = ratio - i r = (1 - ratio) * colors[i][c] + ratio * colors[j][c] return int(r * 255) # width = _img.shape[1] # height = _img.shape[0] draw = ImageDraw.Draw(_img) for i in range(len(_boxes)): box = _boxes[i] x1 = box[0] y1 = box[1] x2 = box[2] y2 = box[3] rgb = (255, 0, 0) if len(box) >= 7 and _class_names: cls_conf = box[5] cls_id = box[6] print('%s: %f' % (_class_names[cls_id], cls_conf)) classes = len(_class_names) offset = cls_id * 123457 % classes red = get_color(2, offset, classes) green = get_color(1, offset, classes) blue = get_color(0, offset, classes) rgb = (red, green, blue) # draw.text((x1, y1), _class_names[cls_id], fill=rgb, font=font) draw.text((x1, y1), _class_names[cls_id], fill=rgb, font=font) draw.rectangle([x1, y1, x2, y2], outline=rgb, width=5) if _savename: print("save plot results to %s" % _savename) _img.save(_savename) return _img def read_truths(lab_path): if not os.path.exists(lab_path): return np.array([]) if os.path.getsize(lab_path): truths = np.loadtxt(lab_path) truths = truths.reshape(truths.size / 5, 5) # to avoid single truth problem return truths else: return np.array([]) def load_class_names(_namesfile): class_names = [] with open(_namesfile, 'r') as fp: lines = fp.readlines() for line in lines: line = line.rstrip() class_names.append(line) return class_names def do_detect(_model, _img, _cfg, 
_use_cuda=1): _model.eval() t0 = time.time() if isinstance(_img, Image.Image): width = _img.width height = _img.height img = torch.ByteTensor(torch.ByteStorage.from_buffer(_img.tobytes())) img = img.view(height, width, 3).transpose(0, 1).transpose(0, 2).contiguous() img = img.view(1, 3, height, width) img = img.float().div(255.0) elif type(_img) == np.ndarray and len(_img.shape) == 3: # cv2 image img = torch.from_numpy(_img.transpose(2, 0, 1)).float().div(255.0).unsqueeze(0) elif type(_img) == np.ndarray and len(_img.shape) == 4: img = torch.from_numpy(_img.transpose(0, 3, 1, 2)).float().div(255.0) else: print("unknow image type") exit(-1) t1 = time.time() if _use_cuda: img = img.cuda() img = torch.autograd.Variable(img) t2 = time.time() list_features = _model(img) list_features_numpy = [] for feature in list_features: list_features_numpy.append(feature.data.cup().numpy()) return post_processing(_img=img, _cfg=_cfg, _list_features_numpy=list_features_numpy, _t0=t0, _t1=t1, _t2=t2) def post_processing(_img, _cfg, _list_features_numpy, _t0, _t1, _t2): anchor_step = len(_cfg.anchors) // _cfg.num_anchors boxes = [] for i in range(3): masked_anchors = [] for m in _cfg.anchor_masks[i]: masked_anchors += _cfg.anchors[m * anchor_step:(m + 1) * anchor_step] masked_anchors = [anchor / _cfg.strides[i] for anchor in masked_anchors] boxes.append(get_region_boxes_out_model(_output=_list_features_numpy[i], _cfg=_cfg, _anchors=masked_anchors, _num_anchors=len(_cfg.anchor_masks[i]))) if _img.shape[0] > 1: bboxs_for_imgs = [ boxes[0][index] + boxes[1][index] + boxes[2][index] for index in range(_img.shape[0])] # 分别对每一张图片的结果进行nms t3 = time.time() boxes = [nms(_boxes=bboxs, _nms_thresh=_cfg.nms_thresh) for bboxs in bboxs_for_imgs] else: boxes = boxes[0][0] + boxes[1][0] + boxes[2][0] t3 = time.time() boxes = nms(boxes, _cfg.nms_thresh) t4 = time.time() if True: print('-----------------------------------') print(' image to tensor : %f' % (_t1 - _t0)) print(' tensor to cuda : %f' % (_t2 - _t1)) print(' predict : %f' % (t3 - _t2)) print(' nms : %f' % (t4 - t3)) print(' total : %f' % (t4 - _t0)) print('-----------------------------------') return boxes def classtxt_processing(_img, _cfg, _list_features_numpy, _t0, _t1, _t2): anchor_step = len(_cfg.anchors) // _cfg.num_anchors boxes = [] for i in range(3): masked_anchors = [] for m in _cfg.anchor_masks[i]: masked_anchors += _cfg.anchors[m * anchor_step:(m + 1) * anchor_step] masked_anchors = [anchor / _cfg.strides[i] for anchor in masked_anchors] boxes.append(get_classtxt_out_model(_output=_list_features_numpy[i], _cfg=_cfg, _anchors=masked_anchors, _num_anchors=len(_cfg.anchor_masks[i]))) if _img.shape[0] > 1: bboxs_for_imgs = [ boxes[0][index] + boxes[1][index] + boxes[2][index] for index in range(_img.shape[0])] # 分别对每一张图片的结果进行nms t3 = time.time() boxes = [nms(_boxes=bboxs, _nms_thresh=_cfg.nms_thresh) for bboxs in bboxs_for_imgs] else: boxes = boxes[0][0] + boxes[1][0] + boxes[2][0] t3 = time.time() boxes = nms(boxes, _cfg.nms_thresh) t4 = time.time() if True: print('-----------------------------------') print(' image to tensor : %f' % (_t1 - _t0)) print(' tensor to cuda : %f' % (_t2 - _t1)) print(' predict : %f' % (t3 - _t2)) print(' nms : %f' % (t4 - t3)) print(' total : %f' % (t4 - _t0)) print('-----------------------------------') return boxes def gen_cls_txt(_model, _img, _cfg, _use_cuda): _model.eval() t0 = time.time() if isinstance(_img, Image.Image): width = _img.width height = _img.height img = 
torch.ByteTensor(torch.ByteStorage.from_buffer(_img.tobytes())) img = img.view(height, width, 3).transpose(0, 1).transpose(0, 2).contiguous() img = img.view(1, 3, height, width) img = img.float().div(255.0) elif type(_img) == np.ndarray and len(_img.shape) == 3: # cv2 image img = torch.from_numpy(_img.transpose(2, 0, 1)).float().div(255.0).unsqueeze(0) elif type(_img) == np.ndarray and len(_img.shape) == 4: img = torch.from_numpy(_img.transpose(0, 3, 1, 2)).float().div(255.0) else: print("unknow image type") exit(-1) t1 = time.time() if _use_cuda: img = img.cuda() img = torch.autograd.Variable(img) t2 = time.time() list_features = _model(img) list_features_numpy = [] for feature in list_features: list_features_numpy.append(feature.data.cpu().numpy()) return classtxt_processing(_img=img, _cfg=_cfg, _list_features_numpy=list_features_numpy, _t0=t0, _t1=t1, _t2=t2)
yolov5_eval.py:
# -*- coding: utf-8 -*- # -------------------------------------------------------- # Fast/er R-CNN # Licensed under The MIT License [see LICENSE for details] # Written by Bharath Hariharan # -------------------------------------------------------- import xml.etree.ElementTree as ET import os import pickle import numpy as np def parse_rec(filename): """ Parse a PASCAL VOC xml file """ tree = ET.parse(filename) objects = [] for obj in tree.findall('object'): obj_struct = {} obj_struct['name'] = (obj.find('name').text).replace(" ", "") obj_struct['pose'] = obj.find('pose').text obj_struct['truncated'] = int(obj.find('truncated').text) obj_struct['difficult'] = int(obj.find('difficult').text) bbox = obj.find('bndbox') obj_struct['bbox'] = [int(bbox.find('xmin').text), int(bbox.find('ymin').text), int(bbox.find('xmax').text), int(bbox.find('ymax').text)] objects.append(obj_struct) return objects def voc_ap(rec, prec, use_07_metric=False): # voc2007的计算方式和voc2012的计算方式不同,目前一般采用第二种 """ ap = voc_ap(rec, prec, [use_07_metric]) Compute VOC AP given precision and recall. If use_07_metric is true, uses the VOC 07 11 point method (default:False). if use_07_metric: # 11 point metric ap = 0. for t in np.arange(0., 1.1, 0.1): if np.sum(rec >= t) == 0: p = 0 else: p = np.max(prec[rec >= t]) ap = ap + p / 11. else: # correct AP calculation # first append sentinel values at the end mrec = np.concatenate(([0.], rec, [1.])) mpre = np.concatenate(([0.], prec, [0.])) # compute the precision envelope for i in range(mpre.size - 1, 0, -1): mpre[i - 1] = np.maximum(mpre[i - 1], mpre[i]) # to calculate area under PR curve, look for points # where X axis (recall) changes value i = np.where(mrec[1:] != mrec[:-1])[0] # and sum (\Delta recall) * prec ap = np.sum((mrec[i + 1] - mrec[i]) * mpre[i + 1]) return ap ## 程序入口 def yolov5_eval(detpath, # 保存检测到的目标框的文件路径,每一类的目标框单独保存在一个文件 annopath, # Annotations的路径 imagesetfile, # 测试图片名字列表 classname, # 类别名称 cachedir, # 缓存文件夹 ovthresh=0.5, # IoU阈值 use_07_metric=False): # mAP计算方法 """rec, prec, ap = voc_eval(eval_classtxt_path, annopath, imagesetfile, classname, [ovthresh], [use_07_metric]) Top level function that does the PASCAL VOC evaluation. eval_classtxt_path: Path to detections eval_classtxt_path.format(classname) should produce the detection results file. annopath: Path to annotations annopath.format(imagename) should be the xml annotations file. imagesetfile: Text file containing the list of images, one image per line. 
classname: Category name (duh) cachedir: Directory for caching the annotations [ovthresh]: Overlap threshold (default = 0.5) [use_07_metric]: Whether to use VOC07's 11 point AP computation (default False) # assumes detections are in eval_classtxt_path.format(classname) # assumes annotations are in annopath.format(imagename) # assumes imagesetfile is a text file with each line an image name # cachedir caches the annotations in a pickle file # first load gt 获取真实目标框 # 当程序第一次运行时,会读取Annotations下的xml文件获取每张图片中真实的目标框 # 然后把获取的结果保存在annotations_cache文件夹中 # 以后再次运行时直接从缓存文件夹中读取真实目标 if not os.path.isdir(cachedir): os.mkdir(cachedir) cachefile = os.path.join(cachedir, 'annots.pkl') # read list of images with open(imagesetfile, 'r') as f: lines = f.readlines() imagenames = [x.strip() for x in lines] if not os.path.isfile(cachefile): # load annots recs = {} for i, imagename in enumerate(imagenames): recs[imagename] = parse_rec(annopath.format(imagename)) if i % 100 == 0: print('Reading annotation for {:d}/{:d}'.format(i + 1, len(imagenames))) # save print('Saving cached annotations to {:s}'.format(cachefile)) # with open(cachefile, 'w') as cls: # pickle.dump(recs, cls) with open(cachefile, 'wb') as f: pickle.dump(recs, f) else: # load with open(cachefile, 'rb') as f: recs = pickle.load(f) # extract gt objects for this class 提取该类的真实目标 class_recs = {} npos = 0 # 保存该类一共有多少真实目标 for imagename in imagenames: R = [obj for obj in recs[imagename] if obj['name'] == classname] # 保存名字为imagename的图片中,类别为classname的目标框的信息 bbox = np.array([x['bbox'] for x in R]) # 目标框的坐标 difficult = np.array([x['difficult'] for x in R]).astype(np.bool) # 是否是难以识别的目标 det = [False] * len(R) # 每一个目标框对应一个det[i],用来判断该目标框是否已经处理过 npos = npos + sum(~difficult) # 计算总的目标个数 class_recs[imagename] = {'bbox': bbox, # 把每一张图像中的目标框信息放到class_recs中 'difficult': difficult, 'det': det} # read dets detfile = detpath.format(classname) # 打开classname类别检测到的目标框文件 with open(detfile, 'r') as f: lines = f.readlines() splitlines = [x.strip().split(' ') for x in lines] image_ids = [x[0] for x in splitlines] # 图像名字 confidence = np.array([float(x[1]) for x in splitlines]) # 置信度 BB = np.array([[float(z) for z in x[2:]] for x in splitlines]) # 目标框坐标 # sort by confidence 按照置信度排序 sorted_ind = np.argsort(-confidence) sorted_scores = np.sort(-confidence) BB = BB[sorted_ind, :] image_ids = [image_ids[x] for x in sorted_ind] # go down dets and mark TPs and FPs nd = len(image_ids) # 统计检测到的目标框个数 tp = np.zeros(nd) # 创建tp列表,列表长度为目标框个数 fp = np.zeros(nd) # 创建fp列表,列表长度为目标框个数 for d in range(nd): R = class_recs[image_ids[d]] # 得到图像名字为image_ids[d]真实的目标框信息 bb = BB[d, :].astype(float) # 得到图像名字为image_ids[d]检测的目标框坐标 ovmax = -np.inf BBGT = R['bbox'].astype(float) # 得到图像名字为image_ids[d]真实的目标框坐标 if BBGT.size > 0: # compute overlaps 计算IoU # intersection ixmin = np.maximum(BBGT[:, 0], bb[0]) iymin = np.maximum(BBGT[:, 1], bb[1]) ixmax = np.minimum(BBGT[:, 2], bb[2]) iymax = np.minimum(BBGT[:, 3], bb[3]) iw = np.maximum(ixmax - ixmin + 1., 0.) ih = np.maximum(iymax - iymin + 1., 0.) inters = iw * ih # union uni = ((bb[2] - bb[0] + 1.) * (bb[3] - bb[1] + 1.) + (BBGT[:, 2] - BBGT[:, 0] + 1.) * (BBGT[:, 3] - BBGT[:, 1] + 1.) - inters) overlaps = inters / uni ovmax = np.max(overlaps) # 检测到的目标框可能预若干个真实目标框都有交集,选择其中交集最大的 jmax = np.argmax(overlaps) if ovmax > ovthresh: # IoU是否大于阈值 if not R['difficult'][jmax]: # 真实目标框是否难以识别 if not R['det'][jmax]: # 该真实目标框是否已经统计过 tp[d] = 1. # 将tp对应第d个位置变成1 R['det'][jmax] = 1 # 将该真实目标框做标记 else: fp[d] = 1. # 否则将fp对应的位置变为1 else: fp[d] = 1. 
# 否则将fp对应的位置变为1 # compute precision recall fp = np.cumsum(fp) # 按列累加,最大值即为tp数量 tp = np.cumsum(tp) # 按列累加,最大值即为fp数量 rec = tp / float(npos) # 计算recall # avoid divide by zero in case the first detection matches a difficult # ground truth prec = tp / np.maximum(tp + fp, np.finfo(np.float64).eps) # 计算精度 ap = voc_ap(rec, prec, use_07_metric) # 计算ap return rec, prec, ap
First, run detect_eval_class_txt.py.
The detection result images are saved in the yolov5\data_test\predictions_manual folder, the test image names are recorded in imgs_name_manual.txt, and the per-class detection results are saved in the yolov5\data_test\class_txt_manual folder. Then run compute_mAP.py.
It prints the AP, recall, and precision of every class as well as the mAP:

"D:\Program Files\Python38\python.exe" E:/tpz/yolov5/mAP/compute_mAP.py
Reading annotation for 1/712
Reading annotation for 101/712
Reading annotation for 201/712
Reading annotation for 301/712
Reading annotation for 401/712
Reading annotation for 501/712
Reading annotation for 601/712
Reading annotation for 701/712
Saving cached annotations to ../data_test/cachedir_manual/annots.pkl
AP for combustion_lining = 0.9992
recall for combustion_lining = 1.0000
precision for combustion_lining = 0.9951
AP for fan = 0.9968
recall for fan = 0.9968
precision for fan = 1.0000
AP for fan_stator_casing_and_support = 0.9995
recall for fan_stator_casing_and_support = 1.0000
precision for fan_stator_casing_and_support = 0.9918
AP for hpc_spool = 1.0000
recall for hpc_spool = 1.0000
precision for hpc_spool = 0.9950
AP for hpc_stage_5 = 0.9967
recall for hpc_stage_5 = 0.9967
precision for hpc_stage_5 = 0.9918
AP for hp_core_casing = 0.9951
recall for hp_core_casing = 0.9967
precision for hp_core_casing = 0.9870
AP for mixer = 0.9992
recall for mixer = 1.0000
precision for mixer = 0.9967
AP for nozzle = 0.9953
recall for nozzle = 0.9953
precision for nozzle = 0.9953
AP for nozzle_cone = 0.9984
recall for nozzle_cone = 0.9984
precision for nozzle_cone = 0.9967
AP for stand = 1.0000
recall for stand = 1.0000
precision for stand = 0.9985
Mean AP = 0.9980
~~~~~~~~
Results:
0.999
0.997
0.999
1.000
0.997
0.995
0.999
0.995
0.998
1.000
~~~~~~~~
0.998
~~~~~~~~
Running mAP_line.py plots the per-class AP curve. The code for plotting the other curves is similar.
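For example, here is a minimal sketch of the analogous recall plot, reading the cls_rec.pkl file saved by compute_mAP.py (same paths and assumptions as mAP_line.py):

import os
import pickle
import numpy as np
import matplotlib.pyplot as plt
from cfg_mAP import Cfg

cfg = Cfg
with open(os.path.join(cfg.manual_cachedir, 'cls_rec.pkl'), 'rb') as f:
    cls_rec = pickle.load(f)  # dict: class name -> final recall

rec = [cls_rec.get(cls, 0.0) for cls in cfg.names]  # keep the order of cfg.names
x = np.arange(1, len(cfg.names) + 1)

plt.plot(x, rec, color='k', linewidth=1.0, linestyle='-.', label='manual_recall')
plt.scatter(x, rec, s=10, color='k')
plt.xticks(x, cfg.names, rotation=45, fontsize=8)
plt.legend(loc='best')
plt.tight_layout()
plt.show()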