This was done on a Jetson TX2.
Here is the link for DeepStream 5. Just follow the instructions there to set up the environment, but the server's graphical environment has issues, so I used the official Docker image instead. Below is a rough walkthrough of running the demo inside Docker (this whole part is about the server, not the Jetson):
Normally that would be it, but the server environment has a problem and you may hit an error like ==cuGraphicsGLRegisterBuffer failed with error(219) gst_eglglessink_cuda_init texture = 1==. According to the official answer, this happens when the NVIDIA OpenGL components were not installed together with the driver; the answer is linked here.
So on-screen display is not possible. To still see the result, open the .txt config file, set enable=0 under [sink0] and enable=1 under [sink1], then run again; you will get an output video named ==out.mp4==, which also confirms the run succeeded. The configuration parameters are documented here.
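For reference, a rough sketch of the two sink groups in the deepstream-app config after this change (field names and values are from my memory of the sample config, so check your own file):
# [sink0] is the on-screen EglSink, disable it
[sink0]
enable=0
type=2
# [sink1] writes to a file, enable it instead
[sink1]
enable=1
type=3
container=1        # 1 = mp4
codec=1            # 1 = h264
output-file=out.mp4
source-id=0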
A note on the Docker image:
==Multi-camera streaming mentions this: Triton==
GitHub address: triton-inference-server.
Search for Triton Inference Server on NVIDIA NGC (which hosts many NVIDIA-related containers) and download the image from there.
Note the correspondence between the Triton image version and the CUDA version; the compatibility matrix is here. # read this first
Two approaches (I recommend starting directly with the second one; the environment is less error-prone, especially for the TensorRT deployment).
Running this Docker image automatically maps ports 8000, 8001 and 8002; if a firewall is active, it also opens these ports, so other machines on the LAN can connect directly without any extra configuration or forwarding. When the container stops, the ports are closed again.
The following comes from its quickstart guide (the bundled demo uses a TensorFlow model and an ONNX model).
Then, following the official README, download the demo models it ships with (check fetch_models.sh for the storage paths and the final model names); one is a TensorFlow Inception model and the other is the densenet_onnx ONNX model.
Then run its simple demo and start the container (it may still throw errors; following the hints, add the flags --shm-size=1g --ulimit memlock=-1 --ulimit stack=67108864):
docker run --gpus all --rm --shm-size=1g --ulimit memlock=-1 --ulimit stack=67108864 -p8000:8000 -p8001:8001 -p8002:8002 -v $(pwd)/model_repository:/models nvcr.io/nvidia/tritonserver:20.12-py3 tritonserver --model-repository=/models
Then fetch the client examples:
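A sketch of trying the ready-made example client from the SDK container, following the official quickstart (the exact client image tag and example paths can differ between releases, so check NGC and the README for your version):
docker pull nvcr.io/nvidia/tritonserver:20.12-py3-sdk
docker run -it --rm --net=host nvcr.io/nvidia/tritonserver:20.12-py3-sdk
# inside the container, run the prebuilt image client against the demo model
/workspace/install/bin/image_client -m densenet_onnx -c 3 -s INCEPTION /workspace/images/mug.jpg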
The combination CentOS-7.9.x86_64-gnu + tritonserver:21.10-py3 + tensorrt:21.10-py3 worked for me.
At first I used a yolov5 .engine built on the host with Wang Xinyu's TensorRT project, and got all kinds of errors about mismatched library versions. I then switched to the build approach of the yolov4-triton-tensorrt project; read its README for the details. A short summary:
First, based on the host's CUDA version, pick the Triton version to use from the matrix linked at the top, say 21.10-py3 here; then pull the TensorRT image of the same version and build the .engine file inside that image:
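Roughly the commands for this step; the mounted path and the serialize command follow my memory of Wang Xinyu's TensorRT yolov5 README (the .wts file must already be exported into the project), so treat this as a sketch rather than exact instructions:
# pull the TensorRT container whose version matches the chosen Triton version
docker pull nvcr.io/nvidia/tensorrt:21.10-py3
# mount the TensorRT yolov5 project and build inside the container
docker run --gpus all -it --rm -v $(pwd)/tensorrtx/yolov5:/workspace/yolov5 nvcr.io/nvidia/tensorrt:21.10-py3
cd /workspace/yolov5 && mkdir -p build && cd build
cmake .. && make                            # also produces libmyplugins.so
./yolov5 -s yolov5l.wts yolov5l.engine l    # serialize the .wts weights into the .engine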
Put the two files generated in the previous step (the .engine and the plugin .so) into folders laid out the way Triton expects; the .engine is normally renamed to model.plan (see the layout sketch below).
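A sketch of the layout I ended up with; the model name yolov5l and the plugins folder name are my own choices, Triton only requires the model_repository/<model>/<version>/model.plan structure:
model_repository/
  yolov5l/
    1/
      model.plan        # the renamed .engine
plugins/
  libmyplugins.so       # the plugin library built together with the engine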
Then pull the Triton image (note the version must match the one chosen above):
docker pull nvcr.io/nvidia/tritonserver:21.10-py3
Once all the folders and models are ready, start the container from a suitable directory:
docker run --gpus all --rm --shm-size=1g --ipc=host --ulimit memlock=-1 --ulimit stack=67108864 -p8000:8000 -p8001:8001 -p8002:8002 -v$(pwd)/model_repository:/model_repository -v$(pwd)/plugins:/plugins --env LD_PRELOAD=/plugins/libmyplugins.so nvcr.io/nvidia/tritonserver:21.10-py3 tritonserver --model-repository=/model_repository --strict-model-config=false --log-verbose 1
Since I did not provide the kind of config file the project expects, I followed the GitHub project and started the container with the extra flags above (--strict-model-config=false --log-verbose 1).
Note:
Because this approach requires specifying --env LD_PRELOAD, and the GitHub yolov4 has one .so file while my yolov5 has another, I cannot point to both at once; so in the host's model_repository folder only one of the two models can be present, otherwise it will not run. I suspect that if I wrote a proper config.pbtxt for each model they could coexist, but I do not know how to write one yet.
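For reference, a minimal config.pbtxt sketch for this model, transcribed from the auto-generated configuration shown in the log further down (I have not verified that writing explicit configs actually solves the plugin coexistence problem):
name: "yolov5l"
platform: "tensorrt_plan"
max_batch_size: 1
input [
  {
    name: "data"
    data_type: TYPE_FP32
    dims: [ 3, 1024, 1024 ]
  }
]
output [
  {
    name: "prob"
    data_type: TYPE_FP32
    dims: [ 6001, 1, 1 ]
  }
]
instance_group [
  {
    kind: KIND_GPU
    count: 1
    gpus: [ 0 ]
  }
]
default_model_filename: "model.plan"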
For the client, I wrote it by following the yolov4-triton-tensorrt client. Some of its details may not match your model, so print your model's configuration like the one below and adjust the client accordingly.
Since we did not write a config file ourselves, the flags set above make Triton auto-generate one, and the generated configuration shows up in the log at server startup; you can copy it from there and try editing it yourself (this should work, though I have not tried it; when there are many models the terminal may scroll past it, so remember to save the output). One lingering issue: the client-side speed with yolov5 is not that great. Here is the configuration generated for my TensorRT yolov5l:
I0715 05:51:22.080915 1 tensorrt.cc:462] post auto-complete:
{
"name": "yolov5l",
"platform": "tensorrt_plan",
"backend": "tensorrt",
"version_policy": {
"latest": {
"num_versions": 1
}
},
"max_batch_size": 1,
"input": [
{
"name": "data", // the client's input name most likely needs to be changed to "data" to match this
"data_type": "TYPE_FP32",
"dims": [
3,
1024,
1024
],
"is_shape_tensor": false
}
],
"output": [
{
"name": "prob", // the client's output name most likely needs to be changed to "prob" to match this
"data_type": "TYPE_FP32",
"dims": [
6001,
1,
1
],
"is_shape_tensor": false
}
],
"batch_input": [],
"batch_output": [],
"optimization": {
"priority": "PRIORITY_DEFAULT",
"input_pinned_memory": {
"enable": true
},
"output_pinned_memory": {
"enable": true
},
"gather_kernel_buffer_threshold": 0,
"eager_batching": false
},
"instance_group": [
{
"name": "yolov5l",
"kind": "KIND_GPU",
"count": 1,
"gpus": [
0,
1
],
"secondary_devices": [],
"profile": [],
"passive": false,
"host_policy": ""
}
],
"default_model_filename": "model.plan",
"cc_model_filenames": {},
"metric_tags": {},
"parameters": {},
"model_warmup": []
}
I0715 05:51:22.082008 1 tensorrt.cc:426] model configuration:
(identical JSON to the auto-completed configuration above, omitted)
The client below is modeled on the Python client of the yolov4-triton-tensorrt project on GitHub, combined with the Python inference code from Wang Xinyu's project:
#!/usr/bin/env python
import argparse
import numpy as np
import sys
import cv2
import time
import torch
import torchvision
import random
import tritonclient.grpc as grpcclient
from tritonclient.utils import InferenceServerException, triton_to_np_dtype
# import tritonclient.utils.shared_memory as shm # the shared-memory module does not work on Windows for now, so the related code was removed
CONF_THRESH = 0.5 # confidence threshold; together with IOU_THRESHOLD below, this could be moved into a config file
IOU_THRESHOLD = 0.4
categories = ['T01', 'T02_1', 'T02_2', 'T02_3', 'T03_1', 'T03_2', 'T03_3', 'T04', 'T05',
'T06', 'T07', 'T08', 'T09', 'T10', 'T11', 'T12', 'T13', 'T14', 'T15',
'T16', 'T17', 'T18', 'T19', 'T20', 'T21', 'T22', 'T26']
def preprocess_image(raw_bgr_image, input_w, input_h):
"""
description: Convert BGR image to RGB,
resize and pad it to target size, normalize to [0,1],
transform to NCHW format.
param:
raw_bgr_image: np.ndarray, BGR image as read by cv2
input_w: int, model input width
input_h: int, model input height
return:
image: the processed image
image_raw: the original image
h: original height
w: original width
"""
# input_w and input_h are the TensorRT model's input width and height
image_raw = raw_bgr_image
h, w, c = image_raw.shape
image = cv2.cvtColor(image_raw, cv2.COLOR_BGR2RGB)
# Calculate width and height and paddings
r_w = input_w / w
r_h = input_h / h
if r_h > r_w:
tw = input_w
th = int(r_w * h)
tx1 = tx2 = 0
ty1 = int((input_h - th) / 2)
ty2 = input_h - th - ty1
else:
tw = int(r_h * w)
th = input_h
tx1 = int((input_w - tw) / 2)
tx2 = input_w - tw - tx1
ty1 = ty2 = 0
# Resize the image with long side while maintaining ratio
image = cv2.resize(image, (tw, th))
# Pad the short side with (128,128,128)
image = cv2.copyMakeBorder(
image, ty1, ty2, tx1, tx2, cv2.BORDER_CONSTANT, (128, 128, 128)
)
image = image.astype(np.float32)
# Normalize to [0,1]
image /= 255.0
# HWC to CHW format:
image = np.transpose(image, [2, 0, 1])
# CHW to NCHW format
image = np.expand_dims(image, axis=0)
# Convert the image to row-major order, also known as "C order":
image = np.ascontiguousarray(image)
return image, image_raw, h, w
def post_process(output, origin_h, origin_w):
"""
description: postprocess the prediction
param:
output: A tensor likes [num_boxes,cx,cy,w,h,conf,cls_id, cx,cy,w,h,conf,cls_id, ...]
origin_h: height of original image
origin_w: width of original image
return:
result_boxes: final boxes, a boxes tensor, each row is a box [x1, y1, x2, y2]
result_scores: final scores, a tensor, each element is the score corresponding to a box
result_classid: final class ids, a tensor, each element is the class id corresponding to a box
"""
# Get the num of boxes detected
num = int(output[0])
# Reshape to a two dimensional ndarray
pred = np.reshape(output[1:], (-1, 6))[:num, :]
# to a torch Tensor
pred = torch.Tensor(pred.copy())
# Get the boxes
boxes = pred[:, :4]
# Get the scores
scores = pred[:, 4]
# Get the classid
classid = pred[:, 5]
# Choose those boxes that score > CONF_THRESH
si = scores > CONF_THRESH
boxes = boxes[si, :]
scores = scores[si]
classid = classid[si]
# Transform bbox from [center_x, center_y, w, h] to [x1, y1, x2, y2]
boxes = xywh2xyxy(origin_h, origin_w, boxes)
# Do nms
indices = torchvision.ops.nms(boxes, scores, iou_threshold=IOU_THRESHOLD).cpu()
result_boxes = boxes[indices, :].cpu()
result_scores = scores[indices].cpu()
result_classid = classid[indices].cpu()
return result_boxes, result_scores, result_classid
input_w = 1024
input_h = 1024
def xywh2xyxy(origin_h, origin_w, x):
"""
description: Convert nx4 boxes from [x, y, w, h] to [x1, y1, x2, y2] where xy1=top-left, xy2=bottom-right
param:
origin_h: height of original image
origin_w: width of original image
x: A boxes tensor, each row is a box [center_x, center_y, w, h]
return:
y: A boxes tensor, each row is a box [x1, y1, x2, y2]
"""
y = torch.zeros_like(x) if isinstance(x, torch.Tensor) else np.zeros_like(x)
r_w = input_w / origin_w
r_h = input_h / origin_h
if r_h > r_w:
y[:, 0] = x[:, 0] - x[:, 2] / 2
y[:, 2] = x[:, 0] + x[:, 2] / 2
y[:, 1] = x[:, 1] - x[:, 3] / 2 - (input_h - r_w * origin_h) / 2
y[:, 3] = x[:, 1] + x[:, 3] / 2 - (input_h - r_w * origin_h) / 2
y /= r_w
else:
y[:, 0] = x[:, 0] - x[:, 2] / 2 - (input_w - r_h * origin_w) / 2
y[:, 2] = x[:, 0] + x[:, 2] / 2 - (input_w - r_h * origin_w) / 2
y[:, 1] = x[:, 1] - x[:, 3] / 2
y[:, 3] = x[:, 1] + x[:, 3] / 2
y /= r_h
return y
def plot_one_box(x, img, color=None, label=None, line_thickness=None):
"""
description: Plots one bounding box on image img,
this function comes from YoLov5 project.
param:
x: a box likes [x1,y1,x2,y2]
img: a opencv image object
color: color to draw rectangle, such as (0,255,0)
label: str
line_thickness: int
return:
no return
"""
tl = (
line_thickness or round(0.002 * (img.shape[0] + img.shape[1]) / 2) + 1
) # line/font thickness
color = color or [random.randint(0, 255) for _ in range(3)]
c1, c2 = (int(x[0]), int(x[1])), (int(x[2]), int(x[3]))
cv2.rectangle(img, c1, c2, color, thickness=tl, lineType=cv2.LINE_AA)
if label:
tf = max(tl - 1, 1) # font thickness
t_size = cv2.getTextSize(label, 0, fontScale=tl / 3, thickness=tf)[0]
c2 = c1[0] + t_size[0], c1[1] - t_size[1] - 3
cv2.rectangle(img, c1, c2, color, -1, cv2.LINE_AA) # filled
cv2.putText(img, label, (c1[0], c1[1] - 2), 0, tl / 3, [225, 255, 255], thickness=tf, lineType=cv2.LINE_AA)
class TritonClient:
def __init__(self, parsers):
self.parser = parsers
print(self.parser)
self.triton_client = self.init_triton_client()
def init_triton_client(self):
FLAGS = self.parser
# Create server context
try:
triton_client = grpcclient.InferenceServerClient(
url=FLAGS.url,
verbose=FLAGS.verbose,
ssl=FLAGS.ssl,
root_certificates=FLAGS.root_certificates,
private_key=FLAGS.private_key,
certificate_chain=FLAGS.certificate_chain)
except Exception as e:
print("context creation failed: " + str(e))
sys.exit()
# Health check
if not triton_client.is_server_live():
print("FAILED : is_server_live")
sys.exit(1)
if not triton_client.is_server_ready():
print("FAILED : is_server_ready")
sys.exit(1)
if not triton_client.is_model_ready(FLAGS.model):
print("FAILED : is_model_ready")
sys.exit(1)
if FLAGS.model_info:
# Model metadata
try:
metadata = triton_client.get_model_metadata(FLAGS.model)
print(metadata)
except InferenceServerException as ex:
if "Request for unknown model" not in ex.message():
print("FAILED : get_model_metadata")
print("Got: {}".format(ex.message()))
sys.exit(1)
else:
print("FAILED : get_model_metadata")
sys.exit(1)
return triton_client
def run(self):
FLAGS = self.parser
inputs = []
outputs = []
inputs.append(grpcclient.InferInput('data', [1, 3, FLAGS.width, FLAGS.height], "FP32"))
outputs.append(grpcclient.InferRequestedOutput('prob')) # 'data' and 'prob' are the auto-generated tensor names, since we did not write a config file
# DUMMY MODE
if FLAGS.mode == 'dummy':
inputs[0].set_data_from_numpy(np.ones(shape=(1, 3, FLAGS.width, FLAGS.height), dtype=np.float32))
results = self.triton_client.infer(model_name=FLAGS.model, inputs=inputs, outputs=outputs,
client_timeout=FLAGS.client_timeout)
if FLAGS.model_info:
statistics = self.triton_client.get_inference_statistics(model_name=FLAGS.model)
if len(statistics.model_stats) != 1:
print("FAILED: get_inference_statistics")
sys.exit(1)
print(statistics)
result = results.as_numpy('prob') # this also has to be 'prob'
print(f"Received result buffer of size {result.shape}")
# IMAGE MODE
if FLAGS.mode == 'image':
if not FLAGS.input:
print("FAILED: no input image")
sys.exit(1)
input_image = cv2.imread(str(FLAGS.input))
if input_image is None:
print(f"FAILED: could not load input image {str(FLAGS.input)}")
sys.exit(1)
# returns: preprocessed image, original image, original height, original width
input_image_buffer, image_raw, origin_h, origin_w = preprocess_image(input_image, FLAGS.width, FLAGS.height)
inputs[0].set_data_from_numpy(input_image_buffer)
results = self.triton_client.infer(model_name=FLAGS.model, inputs=inputs, outputs=outputs,
client_timeout=FLAGS.client_timeout)
if FLAGS.model_info:
statistics = self.triton_client.get_inference_statistics(model_name=FLAGS.model)
if len(statistics.model_stats) != 1:
print("FAILED: get_inference_statistics")
sys.exit(1)
print(statistics)
result = results.as_numpy('prob')
print(f"Received result buffer of size {result.shape}")
# The part below is adapted from Wang Xinyu's code; the 0 here is index i of the first item in the batch
result = result[0]
result_boxes, result_scores, result_classid = post_process(
result[0 * 6001: (0 + 1) * 6001], origin_h, origin_w
)
for i in range(len(result_boxes)):
box = result_boxes[i]
plot_one_box(box, image_raw, label="{}:{:.2f}".format(
categories[int(result_classid[i])], result_scores[i])
)
if FLAGS.out:
cv2.imwrite(FLAGS.out, image_raw)
print(f"Saved result to {FLAGS.out}")
else:
cv2.imshow('image', image_raw)
cv2.waitKey(0)
cv2.destroyAllWindows()
# VIDEO MODE
if FLAGS.mode == 'video':
if FLAGS.input.isdigit():
cap = cv2.VideoCapture(int(FLAGS.input))
else:
cap = cv2.VideoCapture(FLAGS.input)
if not cap.isOpened():
print(f"FAILED: cannot open video {FLAGS.input}")
sys.exit(1)
counter = 0
out = None
while True:
ret, frame = cap.read()
t1 = cv2.getTickCount() # for computing fps
if not ret:
print("failed to fetch next frame")
break
if counter == 0 and FLAGS.out:
print("Opening output video stream...")
fourcc = cv2.VideoWriter_fourcc('M', 'P', '4', 'V')
out = cv2.VideoWriter(FLAGS.out, fourcc, FLAGS.fps, (frame.shape[1], frame.shape[0]))
# returns: preprocessed image, original image, original height, original width
input_image_buffer, image_raw, origin_h, origin_w = preprocess_image(frame, FLAGS.width, FLAGS.height)
inputs[0].set_data_from_numpy(input_image_buffer)
begin = time.time()
results = self.triton_client.infer(model_name=FLAGS.model, inputs=inputs, outputs=outputs,
client_timeout=FLAGS.client_timeout)
end = time.time()
print("inference time: {:.2f}ms".format((end - begin) * 1000))
result = results.as_numpy('prob')
# The part below is adapted from Wang Xinyu's code; the 0 here is index i of the first item in the batch
result = result[0]
result_boxes, result_scores, result_classid = post_process(
result[0 * 6001: (0 + 1) * 6001], origin_h, origin_w
)
for i in range(len(result_boxes)):
box = result_boxes[i]
plot_one_box(box, image_raw, label="{}:{:.2f}".format(
categories[int(result_classid[i])], result_scores[i])
)
counter += 1
if FLAGS.out:
out.write(image_raw)
else:
t2 = (cv2.getTickCount() - t1) / cv2.getTickFrequency()
fps = 1.0 / t2
# the fps does not feel high; the post-processing afterwards probably takes too long
cv2.putText(image_raw, f"fps: {fps:.2f}", (10, 30), cv2.FONT_HERSHEY_PLAIN, 2, (0, 0, 255), 2,
cv2.LINE_AA)
cv2.imshow('image', image_raw)
if cv2.waitKey(1) & 0xFF != 255:
break
if FLAGS.model_info:
statistics = self.triton_client.get_inference_statistics(model_name=FLAGS.model)
if len(statistics.model_stats) != 1:
print("FAILED: get_inference_statistics")
sys.exit(1)
print(statistics)
cap.release()
if FLAGS.out:
out.release()
else:
cv2.destroyAllWindows()
def get_args():
parser = argparse.ArgumentParser()
parser.add_argument('mode', choices=['dummy', 'image', 'video'], default='dummy',
help='Run mode. \'dummy\' will send an empty buffer to the server to test if inference works. \'image\' will process a single image. \'video\' will process a video.')
parser.add_argument('input', type=str, nargs='?', help='Input file to load from in image or video mode')
parser.add_argument('-m', '--model', type=str, required=False, default='yolov5l', # make sure the model name is correct
help='Inference model name, default yolov5l')
parser.add_argument('--width', type=int, required=False, default=1024,
help='Inference model input width, default 1024')
parser.add_argument('--height', type=int, required=False, default=1024,
help='Inference model input height, default 1024')
parser.add_argument('-u', '--url', type=str, required=False,
default='localhost:8001', help='Inference server URL, default localhost:8001')
parser.add_argument('-o', '--out', type=str, required=False, default='',
help='Write output into file instead of displaying it')
parser.add_argument('-c', '--confidence', type=float, required=False, default=0.8,
help='Confidence threshold for detected objects, default 0.8')
parser.add_argument('-n', '--nms', type=float, required=False, default=0.5,
help='Non-maximum suppression threshold for filtering raw boxes, default 0.5')
parser.add_argument('-f', '--fps', type=float, required=False, default=24.0,
help='Video output fps, default 24.0 FPS')
parser.add_argument('-i', '--model-info', action="store_true", required=False, default=False,
help='Print model status, configuration and statistics')
parser.add_argument('-v', '--verbose', action="store_true", required=False, default=False,
help='Enable verbose client output')
parser.add_argument('-t', '--client-timeout', type=float, required=False, default=None,
help='Client timeout in seconds, default no timeout')
parser.add_argument('-s', '--ssl', action="store_true", required=False, default=False,
help='Enable SSL encrypted channel to the server')
parser.add_argument('-r', '--root-certificates', type=str, required=False, default=None,
help='File holding PEM-encoded root certificates, default none')
parser.add_argument('-p', '--private-key', type=str, required=False, default=None,
help='File holding PEM-encoded private key, default is none')
parser.add_argument('-x', '--certificate-chain', type=str, required=False, default=None,
help='File holding PEM-encoded certificate chain, default is none')
return parser
if __name__ == '__main__':
"""
Since the parameters all have defaults set in get_args(), you can just run it directly.
The speed does not seem ideal, though. Example inputs:
r"rtsp://192.168.108.11:554/user=admin&password=&channel=1&stream=1.sdp?"
python new_client.py video 123.mp4
"""
parser = get_args()
client_obj = TritonClient(parser.parse_args())
client_obj.run()
First, TorchScript is mainly used on the Python side: it packages (serializes) the model, which is then loaded in C++ with LibTorch. Tutorial link.
Packaging a model with TorchScript
import torch
import torchvision
model = torchvision.models.resnet18()
example = torch.rand(1, 3, 224, 224)
traced_model = torch.jit.trace(model, example)
traced_model.save("traced_resnet_model.pt")
It can also be loaded back:
trace = torch.jit.load("traced_resnet_model.pt")
output = trace(torch.ones(1, 3, 224, 224))
print(output[0, :5])
Notes:
Loading with LibTorch
First, the environment: I used VS Code + MSVC; see the related notes for the specifics (watch out for possible missing-.dll problems; under PowerShell and CLion no error is reported, you have to run from cmd to see it).
CMakeLists.txt
cmake_minimum_required(VERSION 3.0 FATAL_ERROR)
project(custom_ops)
set(Torch_DIR "D:\\lib\\libtorch_1.8.2_debug\\share\\cmake\\Torch")
find_package(Torch REQUIRED)
add_executable(example-app example_app.cpp)
target_link_libraries(example-app "${TORCH_LIBRARIES}")
set_property(TARGET example-app PROPERTY CXX_STANDARD 14)
example_app.cpp
#include <iostream>
#include <memory>
#include <vector>
#include <torch/script.h>
const std::string model_path("../traced_resnet_model.pt");
int main(int argc, const char* argv[]) {
torch::jit::script::Module module;
try {
// module = torch::jit::load(argv[1]);
module = torch::jit::load(model_path);
}
catch (const c10::Error& e) {
std::cerr << "error loading the model\n";
return -1;
}
std::cout << "ok" << std::endl;
// create a vector of inputs
std::vector<torch::jit::IValue> inputs;
inputs.push_back(torch::ones({1, 3, 224, 224}));
// Execute the model and turn its output into a tensor
at::Tensor output = module.forward(inputs).toTensor();
std::cout << output.slice(/*dim=*/1, /*start=*/0, /*end=*/5) << std::endl;
return 0;
}
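For completeness, roughly how I configure and build this with CMake under MSVC (the generator name is just an example; since the libtorch above is the debug build, the Debug config is used):
mkdir build
cd build
cmake .. -G "Visual Studio 16 2019" -A x64
cmake --build . --config Debug
.\Debug\example-app.exe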
You need to download the models and then put them into the model\patch_experts folder under the directory of your compiled main program.
The original wording is: The C++ models have the .dat extension and should be placed in the lib\local\LandmarkDetector\model\patch_experts folder if you are compiling from code, and in the model\patch_experts folder if you have downloaded the binaries.
Here are the download addresses for the 4 models; copy the link straight into Thunder (Xunlei) to download:
Building it yourself:
Project address: https://github.com/TadasBaltrusaitis/OpenFace
Just follow the official instructions; I recommend installing cmake and opencv yourself first.
Under CentOS:
yum search openblas
then install the right package from the results: yum install openblas-devel.x86_64
Once the environment is ready, to start building, go into the openface directory:
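Roughly the build commands, from my memory of the OpenFace wiki (flags are an example, not a prescription):
mkdir -p build && cd build
cmake -D CMAKE_BUILD_TYPE=RELEASE ..
make -j4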
After the build finishes, put the four .dat files downloaded above into the build/bin/model/patch_experts/ directory. Then it is ready to use (example command below).
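A quick sketch of running it once the models are in place (FeatureExtraction is one of the built binaries; the video path and output directory are placeholders):
cd build/bin
./FeatureExtraction -f /path/to/video.mp4 -out_dir ./processed   # writes a .csv with landmarks, head pose and gaze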
Command-line argument reference: here;
Tips: on machines with Anaconda installed, the build may report some errors; temporarily remove all the Anaconda entries from LD_LIBRARY_PATH first, then build again. See the environment-issues md file for details.
Parameter notes:
Head orientation:
Looking at the result CSV, the last three angle values correspond in order to pitch (up/down), yaw (left/right) and roll (head tilt); the second one (yaw) is the main value to watch.
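A small sketch of pulling those angles out of the result CSV with pandas; the column names pose_Rx/pose_Ry/pose_Rz (radians) are what I recall OpenFace using for pitch/yaw/roll, so verify them against your own CSV header:
import pandas as pd
import numpy as np

df = pd.read_csv("processed/video.csv")
df.columns = df.columns.str.strip()   # OpenFace headers often have leading spaces

pitch = np.degrees(df["pose_Rx"])     # up/down
yaw = np.degrees(df["pose_Ry"])       # left/right, the value we mostly care about
roll = np.degrees(df["pose_Rz"])      # head tilt
print(yaw.describe())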
Gaze:
OGAMA (Open Gaze And Mouse Analyzer): http://www.ogama.net/
opensesame
Tableau, for data visualization; official site https://www.tableau.com/; an introduction: 一天入门Tableau–你也可以 (Zhihu) https://zhuanlan.zhihu.com/p/71502618
Eyeware, a Swiss company.
Datasets: search here: https://paperswithcode.com/paper/mpiigaze-real-world-dataset-and-deep 1. Gaze360 2. MPIIGaze 3. GazeCapture
These are mainly notes on building OpenPose on CentOS:
First just follow the official installation steps; cmake usually goes through fine, but a pile of errors appears during make. One by one:
Some libraries are simply missing: just run yum search <library name> and install what you find.
Next it may complain that the yum-installed boost is 1.53, below the required minimum of 1.54, so boost has to be built from source; I chose 1.75. Do not go too new, because cmake 3.19 is fairly old and some newer boost versions are not supported.
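Roughly how I built boost 1.75 from source (the install prefix matches the path referenced later in these notes):
cd boost_1_75_0
./bootstrap.sh --prefix=/opt/boost/boost_1_75_0/my_install
./b2 install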
Further along you will likely hit this error: ==Could NOT find Atlas (missing: Atlas_CBLAS_LIBRARY Atlas_BLAS_LIBRARY)==, and it persists even after installing Atlas via yum. The fix:
In the openpose/3rdparty/caffe/cmake directory, edit the Dependencies.cmake file, at roughly line 120:
if(BLAS STREQUAL "Atlas" OR BLAS STREQUAL "atlas")
  # These 3 lines are the original ones; I replaced them with the 3 lines below,
  # because Atlas kept failing with "Could NOT find Atlas (missing: Atlas_CBLAS_LIBRARY Atlas_BLAS_LIBRARY)"
  # even after installing it via yum.
  # find_package(Atlas REQUIRED)
  # list(APPEND Caffe_INCLUDE_DIRS PUBLIC ${Atlas_INCLUDE_DIR})
  # list(APPEND Caffe_LINKER_LIBS PUBLIC ${Atlas_LIBRARIES})
  # These 3 lines are what I added instead:
  find_package(OpenBLAS REQUIRED)
  list(APPEND Caffe_INCLUDE_DIRS PUBLIC ${OpenBLAS_INCLUDE_DIR})
  list(APPEND Caffe_LINKER_LIBS PUBLIC ${OpenBLAS_LIB})
So Atlas kept causing problems and I simply switched to OpenBLAS. (Caffe needs a BLAS at build time, and BLAS can be provided via ATLAS, MKL, or OpenBLAS; any one of them should work.)
The two list(APPEND ...) lines must be changed together with the find_package line; if you only change find_package, make may fail at the final linking step with errors like these:
../lib/libcaffe.so.1.0.0: undefined reference to `cblas_ddot'
../lib/libcaffe.so.1.0.0: undefined reference to `cblas_daxpy'
../lib/libcaffe.so.1.0.0: undefined reference to `cblas_dasum'
../lib/libcaffe.so.1.0.0: undefined reference to `cblas_dcopy'
../lib/libcaffe.so.1.0.0: undefined reference to `cblas_sdot'
../lib/libcaffe.so.1.0.0: undefined reference to `cblas_dscal'
../lib/libcaffe.so.1.0.0: undefined reference to `cblas_sgemm'
../lib/libcaffe.so.1.0.0: undefined reference to `cblas_dgemm'
../lib/libcaffe.so.1.0.0: undefined reference to `cblas_sgemv'
../lib/libcaffe.so.1.0.0: undefined reference to `cblas_sscal'
../lib/libcaffe.so.1.0.0: undefined reference to `cblas_scopy'
../lib/libcaffe.so.1.0.0: undefined reference to `cblas_saxpy'
../lib/libcaffe.so.1.0.0: undefined reference to `cblas_dgemv'
../lib/libcaffe.so.1.0.0: undefined reference to `cblas_sasum'
cuDNN may also report errors saying its headers cannot be found. That happens when, while installing cuDNN, only cudnn.h was copied; in the future copy all the headers from the cuDNN include directory (sketch below).
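Roughly the copy commands I mean; the paths assume the usual cuDNN tarball layout and the default CUDA install location:
# copy ALL cuDNN headers and libraries, not just cudnn.h
sudo cp cuda/include/cudnn*.h /usr/local/cuda/include/
sudo cp cuda/lib64/libcudnn* /usr/local/cuda/lib64/
sudo chmod a+r /usr/local/cuda/include/cudnn*.h /usr/local/cuda/lib64/libcudnn*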
At the very end, one more error may appear: "fatal error: boost/shared_ptr.hpp: No such file or directory", even though boost is set up correctly. In that case add include_directories(/opt/boost/boost_1_75_0/my_install/include) at the very top of CMakeLists.txt, or set the CPLUS_INCLUDE_PATH header search path.
Go look at my own YOLOX demo script.
OpenVINO offline installer: here.
First, installation: see this link and just follow the tutorial there; it installs some libraries into the default Python environment.
Third-party libraries such as opencv and boost have already been added to the environment variables in ~/.bashrc, and openblas has already been installed via yum.
set -e
mkdir build && cd build
cmake -DBLAS=open -DBUILD_docs=OFF -DUSE_LMDB=OFF -DUSE_LEVELDB=OFF -DUSE_CUDNN=OFF ..
make -j8
make install # installs under ./build/install
When building with its bundled Makefile instead, the opencv and boost headers may not be found; just set the environment variables, e.g.:
export CPLUS_INCLUDE_PATH=/opt/boost/boost_1_75_0/my_install/include:$CPLUS_INCLUDE_PATH
export CPLUS_INCLUDE_PATH=/opt/opencv-4.5.3/install/include/opencv4:$CPLUS_INCLUDE_PATH
After installing, bin/ contains ==upgrade_net_proto_text== and ==upgrade_net_proto_binary==, the tools ncnn needs for upgrading model definitions and parameters.
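A sketch of how these tools are typically invoked before converting a Caffe model (file names are placeholders):
# upgrade an old-style prototxt / caffemodel to the current format
./build/install/bin/upgrade_net_proto_text old_deploy.prototxt new_deploy.prototxt
./build/install/bin/upgrade_net_proto_binary old_weights.caffemodel new_weights.caffemodel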
Kaldi is more or less the foundation of all speech-processing toolkits; a short Chinese introduction is here, and download addresses for the related datasets are here.
Installation (for Linux):
There are two methods (both described in the INSTALL file in the root directory). One is the cmake route, which ran into problems and I never got working; so I used the other one, which just uses the scripts and Makefiles it provides:
First go into ==tools/== under the top directory and follow the ==INSTALL== file there (read it together with this); in short:
extras/check_dependencies.sh
# the first run may complain about missing packages such as Intel's MKL; install what it suggests, then run the script again and it should pass
make
Then go into ==src/== under the top directory and follow the ==INSTALL== file there (read it together with this); in short it comes down to ./configure and make (sketch below).
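Roughly what the ==src/== build looks like, from my memory of the src/INSTALL file (the --shared flag and -j values are just examples):
cd ../src
./configure --shared
make depend -j 4
make -j 4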
Then prepare the data following one of the example recipes; the examples all live under ==egs/==. Here aidatatang_200zh is used as the example; a walkthrough of the data is here:
Run the ./run.sh script.
Unfortunately I still never got it fully working in the end; it seems there may be some conflict between the Anaconda Python environment and the system Python environment.
The third step is testing, which seems a bit involved and errored out for me. A simple demo is ==egs/yesno/s5==; go into it and ./run.sh should succeed directly.