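Seven-class facial expression recognition is an interdisciplinary research area that draws on psychology, cognitive science, computer vision, and machine learning.
Dataset contents: RAF-DB is a large-scale facial expression database whose expressions were labeled by 315 annotators (university students and staff).
For the label set, six basic emotions plus neutral, seven expressions in total, were selected from a wider range of candidates (for example: smiling, giggling, crying, anger, fear, terror, shock, surprise, disgust, expressionless).
Dataset size: RAF-DB contains roughly 30,000 facial images. Besides the expression label, each face is annotated with 5 landmark points, a face bounding box, and attributes such as race, age range, and gender.
Dataset uses: expression recognition, face detection, age estimation
Download link: http://www.whdeng.cn/RAF/model1.html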
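Since RAF-DB ships a face bounding box and an expression label per face, training a YOLO detector on it means writing one label line per face in YOLO's text format: a zero-based class id followed by the box centre and size, all normalized to the image dimensions. The sketch below shows that arithmetic only. The helper name `to_yolo_label` is hypothetical, and the assumption that the dataset's labels run from 1 to 7 in the order surprise, fear, disgust, happiness, sadness, anger, neutral is mine (it matches the `class_names` array in the C++ code later in this post), so check it against the annotation files you actually download.

```cpp
#include <cstdio>
#include <string>

// Hypothetical helper: turn one annotation (pixel-space face box plus a
// 1-based expression label) into a YOLO-format label line.
// Assumption: labels 1..7 mean surprise, fear, disgust, happiness,
// sadness, anger, neutral, in that order.
std::string to_yolo_label(int label_1based,
                          float x0, float y0, float x1, float y1,
                          int img_w, int img_h)
{
    const int cls = label_1based - 1;            // YOLO class ids start at 0
    const float cx = (x0 + x1) * 0.5f / img_w;   // normalized box centre
    const float cy = (y0 + y1) * 0.5f / img_h;
    const float w = (x1 - x0) / img_w;           // normalized box size
    const float h = (y1 - y0) / img_h;

    char line[128];
    std::snprintf(line, sizeof(line), "%d %.6f %.6f %.6f %.6f", cls, cx, cy, w, h);
    return std::string(line);
}
```

For a 100x100 image with a "happiness" face (label 4) spanning (25, 25) to (75, 75), `to_yolo_label(4, 25, 25, 75, 75, 100, 100)` yields a line with class id 3 and a centred box covering half of each dimension.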
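YOLOv8 is a recent iteration of the YOLO series that brings a number of changes to object detection. Its key characteristics are:
Model structure: YOLOv8 provides new SOTA models, including object detection networks at different resolutions and an instance segmentation model based on YOLACT. Its backbone and neck appear to draw on the ELAN design from YOLOv7: the C3 block of YOLOv5 is replaced by the C2f block, and channel widths are tuned separately for each model scale.
Head: compared with YOLOv5, the head changes substantially. It uses the now-common decoupled head, which separates the classification and box-regression branches, and it moves from anchor-based to anchor-free prediction.
Loss: YOLOv8 uses the TaskAlignedAssigner strategy for positive-sample assignment and introduces Distribution Focal Loss for box regression, which helps detection accuracy.
Data augmentation: during training, YOLOv8 adopts the YOLOX practice of switching off Mosaic augmentation for the last 10 epochs, which can noticeably improve accuracy.
Training schedule: the total number of training epochs is raised from 300 to 500. Training takes longer, but the model may perform better.
Performance and speed: YOLOv8 aims for a good balance between accuracy and speed, which makes it suitable for real-time object detection across many application areas.
Pretrained models: YOLOv8 ships a range of pretrained models covering different tasks and performance requirements, so it is easier to find one that fits your use case.
Supported tasks and modes: the YOLOv8 family offers several model types, each dedicated to a specific computer vision task such as object detection, instance segmentation, or pose/keypoint detection.
Benchmarking: YOLOv8 can run benchmarks that measure speed and accuracy across the different export formats.
Practical deployment: tutorials exist for deploying YOLOv8 in LabVIEW, covering model export and inference on images and video.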
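To make the anchor-free head and the Distribution Focal Loss items above more concrete: with DFL, each of the four box sides (left, top, right, bottom) is predicted as a distribution over a small number of discrete bins, and the side's distance from the grid cell centre is the expected bin index multiplied by the stride. The sketch below is a standalone version of that decoding with no ncnn dependency. The names `dfl_expectation`, `decode_box`, and `Box` are illustrative, and the 16 bins per side used in the usage note are an assumption taken from the `reg_max_1` constant in the C++ code later in this post.

```cpp
#include <algorithm>
#include <cmath>
#include <vector>

struct Box { float x0, y0, x1, y1; };

// Softmax over n raw bin scores, followed by the expected bin index.
float dfl_expectation(const float* logits, int n)
{
    float max_logit = logits[0];
    for (int l = 1; l < n; l++) max_logit = std::max(max_logit, logits[l]);

    float denom = 0.f;
    float weighted = 0.f;
    for (int l = 0; l < n; l++)
    {
        const float e = std::exp(logits[l] - max_logit); // shifted for numerical stability
        denom += e;
        weighted += l * e;
    }
    return weighted / denom;
}

// logits holds 4 * reg_max values ordered left, top, right, bottom.
// (grid_x, grid_y) is the cell index on the feature map with the given stride.
Box decode_box(const std::vector<float>& logits, int reg_max,
               int grid_x, int grid_y, int stride)
{
    float ltrb[4];
    for (int k = 0; k < 4; k++)
        ltrb[k] = dfl_expectation(logits.data() + k * reg_max, reg_max) * stride;

    // Anchor-free: the reference point is simply the centre of the grid cell.
    const float cx = (grid_x + 0.5f) * stride;
    const float cy = (grid_y + 0.5f) * stride;
    return { cx - ltrb[0], cy - ltrb[1], cx + ltrb[2], cy + ltrb[3] };
}
```

With all-zero logits and 16 bins the distribution is uniform, so every side comes out at 7.5 bins; a sharply peaked distribution gives a distance close to the index of the peak bin.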
Environment setup:
conda create -n yolov8 python=3.8
conda activate yolov8
pip install ultralytics
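Once installation is complete, split the data and train the model. After training, convert the model to ONNX. The following command exports a YOLO model from PyTorch to ONNX format with opset set to 12: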
yolo export model=yolov8s.pt format=onnx dynamic=False opset=12
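The parts of the command mean the following:
yolo export: use the YOLO export function
model=yolov8s.pt: path to the PyTorch model (yolov8s.pt here; point it at your own trained weights)
format=onnx: export to ONNX format
dynamic=False: disable dynamic input shapes
opset=12: set the ONNX opset version of the exported model to 12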
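Because dynamic=False freezes the input shape, the size of the raw detection head output is known in advance, which makes a handy sanity check on an exported model. The sketch below computes it under assumptions taken from the C++ code later in this post: strides of 8, 16 and 32, 16 DFL bins per box side, and 7 classes. The 640x640 input in the usage note is the Ultralytics default export size, so adjust it if you exported with a different `imgsz`. `HeadShape` and `expected_head_shape` are illustrative names.

```cpp
#include <vector>

struct HeadShape
{
    int num_points; // one prediction per grid cell, summed over all strides
    int channels;   // 4 box sides * reg_max bins + one score per class
};

HeadShape expected_head_shape(int input_w, int input_h,
                              const std::vector<int>& strides,
                              int num_class, int reg_max)
{
    int points = 0;
    for (int s : strides)
        points += (input_w / s) * (input_h / s); // grid cells at this stride
    return { points, 4 * reg_max + num_class };
}
```

For a 640x640 input this gives 80*80 + 40*40 + 20*20 = 8400 prediction points with 4*16 + 7 = 71 values each. If the tensor you extract in ncnn has a different shape, the class count or the output blob name is probably not what the post-processing expects.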
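With the ONNX model in hand, convert it to an ncnn model, either with onnx2ncnn.exe or with an online model conversion tool.
Download the prebuilt ncnn library from the official releases. I use VS2022 as the IDE here, so pick the build that matches your own toolchain. Add the lib and include directories to the project. If you want to run inference on the GPU, you need an ncnn build with Vulkan enabled.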
#include "FacialEmotion.h"

static float fast_exp(float x)
{
    union {
        uint32_t i;
        float f;
    } v{};
    v.i = (1 << 23) * (1.4426950409 * x + 126.93490512f);
    return v.f;
}

static inline float sigmoid(float x)
{
    return 1.0f / (1.0f + fast_exp(-x));
}

static float intersection_area(const Object& a, const Object& b)
{
    cv::Rect_<float> inter = a.rect & b.rect;
    return inter.area();
}

static void qsort_descent_inplace(std::vector<Object>& faceobjects, int left, int right)
{
    int i = left;
    int j = right;
    float p = faceobjects[(left + right) / 2].prob;

    while (i <= j)
    {
        while (faceobjects[i].prob > p)
            i++;

        while (faceobjects[j].prob < p)
            j--;

        if (i <= j)
        {
            // swap
            std::swap(faceobjects[i], faceobjects[j]);

            i++;
            j--;
        }
    }

    // #pragma omp parallel sections
    {
        // #pragma omp section
        {
            if (left < j) qsort_descent_inplace(faceobjects, left, j);
        }
        // #pragma omp section
        {
            if (i < right) qsort_descent_inplace(faceobjects, i, right);
        }
    }
}

static void qsort_descent_inplace(std::vector<Object>& faceobjects)
{
    if (faceobjects.empty())
        return;

    qsort_descent_inplace(faceobjects, 0, faceobjects.size() - 1);
}

static void nms_sorted_bboxes(const std::vector<Object>& faceobjects, std::vector<int>& picked, float nms_threshold)
{
    picked.clear();

    const int n = faceobjects.size();

    std::vector<float> areas(n);
    for (int i = 0; i < n; i++)
    {
        areas[i] = faceobjects[i].rect.width * faceobjects[i].rect.height;
    }

    for (int i = 0; i < n; i++)
    {
        const Object& a = faceobjects[i];

        int keep = 1;
        for (int j = 0; j < (int)picked.size(); j++)
        {
            const Object& b = faceobjects[picked[j]];

            // intersection over union
            float inter_area = intersection_area(a, b);
            float union_area = areas[i] + areas[picked[j]] - inter_area;
            // float IoU = inter_area / union_area
            if (inter_area / union_area > nms_threshold)
                keep = 0;
        }

        if (keep)
            picked.push_back(i);
    }
}

static void generate_grids_and_stride(const int target_w, const int target_h, std::vector<int>& strides, std::vector<GridAndStride>& grid_strides)
{
    for (int i = 0; i < (int)strides.size(); i++)
    {
        int stride = strides[i];
        int num_grid_w = target_w / stride;
        int num_grid_h = target_h / stride;
        for (int g1 = 0; g1 < num_grid_h; g1++)
        {
            for (int g0 = 0; g0 < num_grid_w; g0++)
            {
                GridAndStride gs;
                gs.grid0 = g0;
                gs.grid1 = g1;
                gs.stride = stride;
                grid_strides.push_back(gs);
            }
        }
    }
}

static void generate_proposals(std::vector<GridAndStride> grid_strides, const ncnn::Mat& pred, float prob_threshold, std::vector<Object>& objects)
{
    const int num_points = grid_strides.size();
    const int num_class = 7;
    const int reg_max_1 = 16;

    for (int i = 0; i < num_points; i++) // out.h
    {
        const float* scores = pred.row(i) + 4 * reg_max_1;

        // find label with max score
        int label = -1;
        float score = -FLT_MAX;
        for (int k = 0; k < num_class; k++)
        {
            float confidence = scores[k];
            if (confidence > score)
            {
                label = k;
                score = confidence;
            }
        }
        float box_prob = sigmoid(score);
        if (box_prob >= prob_threshold)
        {
            ncnn::Mat bbox_pred(reg_max_1, 4, (void*)pred.row(i));
            {
                ncnn::Layer* softmax = ncnn::create_layer(ncnn::layer_to_index("Softmax")); // ncnn::layer_to_index("Softmax")

                ncnn::ParamDict pd;
                pd.set(0, 1); // axis
                // pd.set(1, 1);
                softmax->load_param(pd);

                ncnn::Option opt;
                opt.num_threads = 1;
                opt.use_packing_layout = false;

                softmax->create_pipeline(opt);

                softmax->forward_inplace(bbox_pred, opt);

                softmax->destroy_pipeline(opt);

                delete softmax;
            }

            float pred_ltrb[4];
            for (int k = 0; k < 4; k++)
            {
                float dis = 0.f;
                const float* dis_after_sm = bbox_pred.row(k);
                for (int l = 0; l < reg_max_1; l++)
                {
                    dis += l * dis_after_sm[l];
                }

                pred_ltrb[k] = dis * grid_strides[i].stride;
            }

            float pb_cx = (grid_strides[i].grid0 + 0.5f) * grid_strides[i].stride;
            float pb_cy = (grid_strides[i].grid1 + 0.5f) * grid_strides[i].stride;

            float x0 = pb_cx - pred_ltrb[0];
            float y0 = pb_cy - pred_ltrb[1];
            float x1 = pb_cx + pred_ltrb[2];
            float y1 = pb_cy + pred_ltrb[3];

            Object obj;
            obj.rect.x = x0;
            obj.rect.y = y0;
            obj.rect.width = x1 - x0;
            obj.rect.height = y1 - y0;
            obj.label = label;
            obj.prob = box_prob;

            objects.push_back(obj);
        }
    }
}

// Transpose the output using ncnn's Permute layer
static void transpose(const ncnn::Mat& in, ncnn::Mat& out)
{
    ncnn::Option opt;
    opt.num_threads = 1;
    opt.use_fp16_storage = false;
    opt.use_packing_layout = true;

    ncnn::Layer* op = ncnn::create_layer("Permute");

    // set param
    ncnn::ParamDict pd;
    pd.set(0, 1); // order_type=1

    op->load_param(pd);

    op->create_pipeline(opt);

    op->forward(in, out, opt);

    op->destroy_pipeline(opt);

    delete op;
}

FacialEmotion::FacialEmotion()
{
    blob_pool_allocator.set_size_compare_ratio(0.f);
    workspace_pool_allocator.set_size_compare_ratio(0.f);
}

int FacialEmotion::load(std::string parma_path, std::string bin_path, int _target_size, bool use_gpu)
{
    yolo.clear();
    blob_pool_allocator.clear();
    workspace_pool_allocator.clear();

    ncnn::set_cpu_powersave(2);
    ncnn::set_omp_num_threads(ncnn::get_big_cpu_count());

    yolo.opt = ncnn::Option();

#if NCNN_VULKAN
    yolo.opt.use_vulkan_compute = use_gpu;
#endif

    yolo.opt.num_threads = 2;
    yolo.opt.blob_allocator = &blob_pool_allocator;
    yolo.opt.workspace_allocator = &workspace_pool_allocator;

    yolo.load_param(parma_path.c_str());
    yolo.load_model(bin_path.c_str());

    target_size = _target_size;

    return 0;
}

int FacialEmotion::detect(const cv::Mat& rgb, std::vector<Object>& objects, float prob_threshold, float nms_threshold)
{
    int width = rgb.cols;
    int height = rgb.rows;

    // pad to multiple of 32
    int w = width;
    int h = height;
    float scale = 1.f;
    if (w > h)
    {
        scale = (float)target_size / w;
        w = target_size;
        h = h * scale;
    }
    else
    {
        scale = (float)target_size / h;
        h = target_size;
        w = w * scale;
    }

    ncnn::Mat in = ncnn::Mat::from_pixels_resize(rgb.data, ncnn::Mat::PIXEL_BGR, width, height, w, h);

    // pad to target_size rectangle
    int wpad = (w + 31) / 32 * 32 - w;
    int hpad = (h + 31) / 32 * 32 - h;
    ncnn::Mat in_pad;
    ncnn::copy_make_border(in, in_pad, hpad / 2, hpad - hpad / 2, wpad / 2, wpad - wpad / 2, ncnn::BORDER_CONSTANT, 0.f);

    in_pad.substract_mean_normalize(0, norm_vals);

    ncnn::Extractor ex = yolo.create_extractor();

    ex.input("images", in_pad);

    std::vector<Object> proposals;

    ncnn::Mat out;
    ex.extract("/model.22/Concat_3_output_0", out);

    ncnn::Mat out1;
    transpose(out, out1);

    std::vector<int> strides = {8, 16, 32}; // might have stride=64
    std::vector<GridAndStride> grid_strides;
    generate_grids_and_stride(in_pad.w, in_pad.h, strides, grid_strides);
    generate_proposals(grid_strides, out1, prob_threshold, proposals);

    qsort_descent_inplace(proposals);

    // apply nms with nms_threshold
    std::vector<int> picked;
    nms_sorted_bboxes(proposals, picked, nms_threshold);

    int count = picked.size();

    objects.resize(count);
    for (int i = 0; i < count; i++)
    {
        objects[i] = proposals[picked[i]];

        // adjust offset to original unpadded
        float x0 = (objects[i].rect.x - (wpad / 2)) / scale;
        float y0 = (objects[i].rect.y - (hpad / 2)) / scale;
        float x1 = (objects[i].rect.x + objects[i].rect.width - (wpad / 2)) / scale;
        float y1 = (objects[i].rect.y + objects[i].rect.height - (hpad / 2)) / scale;

        // clip
        x0 = std::max(std::min(x0, (float)(width - 1)), 0.f);
        y0 = std::max(std::min(y0, (float)(height - 1)), 0.f);
        x1 = std::max(std::min(x1, (float)(width - 1)), 0.f);
        y1 = std::max(std::min(y1, (float)(height - 1)), 0.f);

        objects[i].rect.x = x0;
        objects[i].rect.y = y0;
        objects[i].rect.width = x1 - x0;
        objects[i].rect.height = y1 - y0;
    }

    // sort objects by area
    struct
    {
        bool operator()(const Object& a, const Object& b) const
        {
            return a.rect.area() > b.rect.area();
        }
    } objects_area_greater;
    std::sort(objects.begin(), objects.end(), objects_area_greater);

    return 0;
}

int FacialEmotion::draw(cv::Mat& rgb, const std::vector<Object>& objects)
{
    static const char* class_names[] = {
        "surprise", "fear", "disgust", "happiness", "sadness", "anger", "neutral"
    };

    static const unsigned char colors[19][3] = {
        { 54,  67, 244},
        { 99,  30, 233},
        {176,  39, 156},
        {183,  58, 103},
        {181,  81,  63},
        {243, 150,  33},
        {244, 169,   3},
        {212, 188,   0},
        {136, 150,   0},
        { 80, 175,  76},
        { 74, 195, 139},
        { 57, 220, 205},
        { 59, 235, 255},
        {  7, 193, 255},
        {  0, 152, 255},
        { 34,  87, 255},
        { 72,  85, 121},
        {158, 158, 158},
        {139, 125,  96}
    };

    int color_index = 0;

    for (size_t i = 0; i < objects.size(); i++)
    {
        const Object& obj = objects[i];

        const unsigned char* color = colors[color_index % 19];
        color_index++;

        cv::Scalar cc(color[0], color[1], color[2]);

        cv::rectangle(rgb, obj.rect, cc, 2);

        char text[256];
        sprintf(text, "%s %.1f%%", class_names[obj.label], obj.prob * 100);

        int baseLine = 0;
        cv::Size label_size = cv::getTextSize(text, cv::FONT_HERSHEY_SIMPLEX, 0.5, 1, &baseLine);

        int x = obj.rect.x;
        int y = obj.rect.y - label_size.height - baseLine;
        if (y < 0)
            y = 0;
        if (x + label_size.width > rgb.cols)
            x = rgb.cols - label_size.width;

        cv::rectangle(rgb, cv::Rect(cv::Point(x, y), cv::Size(label_size.width, label_size.height + baseLine)), cc, -1);

        cv::Scalar textcc = (color[0] + color[1] + color[2] >= 381) ? cv::Scalar(0, 0, 0) : cv::Scalar(255, 255, 255);

        cv::putText(rgb, text, cv::Point(x, y + label_size.height), cv::FONT_HERSHEY_SIMPLEX, 0.5, textcc, 1);
    }

    return 0;
}
The result looks like this:
Source code: https://download.csdn.net/download/matt45m/89612957