CenterFormer Environment Setup on Jetson Orin + nuScenes Training and Testing (Fairly Comprehensive)

CenterFormer environment setup (Orin)

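I ran into quite a few problems while installing and configuring CenterFormer, and there are few setup tutorials for Orin, so I have collected my fixes here for reference. Feel free to discuss in the comments.
Official installation guide: link
Official installation guide (nuScenes): link
My environment: JetPack = 5.0.2, torch = 1.11.0, torchvision = 0.12.0
(Figure: dependencies required by the official GitHub repository)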
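
Before going further, it can help to confirm that the interpreter and packages match the versions listed above. The following is only a minimal sketch: the helper names (`matches`, `collect_versions`, `report`) are hypothetical, and the expected versions are simply the ones from my environment.

```python
import platform

# Versions reported in this post; adjust them if your JetPack release differs.
EXPECTED = {"python": "3.8", "torch": "1.11.0", "torchvision": "0.12.0"}


def matches(found: str, expected: str) -> bool:
    """True if `found` begins with `expected` as a full version prefix.

    Local build suffixes such as '+cu114' are ignored, and '3.80' is not
    treated as a match for '3.8'.
    """
    base = found.split("+")[0]
    if not base.startswith(expected):
        return False
    rest = base[len(expected):]
    return rest == "" or not rest[0].isdigit()


def collect_versions() -> dict:
    """Gather installed versions; a package that cannot be imported is None."""
    found = {"python": platform.python_version()}
    for name in ("torch", "torchvision"):
        try:
            module = __import__(name)
            found[name] = getattr(module, "__version__", None)
        except ImportError:
            found[name] = None
    return found


def report(found: dict) -> list:
    """Return mismatch messages; an empty list means everything matches."""
    problems = []
    for name, expected in EXPECTED.items():
        actual = found.get(name)
        if actual is None:
            problems.append(f"{name}: not installed (expected {expected})")
        elif not matches(actual, expected):
            problems.append(f"{name}: found {actual}, expected {expected}")
    return problems
```

Calling `report(collect_versions())` inside the conda environment returns a list of mismatches, which should be empty if the setup matches mine.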

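Installing dependencies (on x86 Linux the whole installation ran without errors; on arm64 several errors occurred, and the fixes are given below)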

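Follow the installation requirements referenced above.
No torch or torchvision packages were available for Python 3.9 on this platform, so Python 3.8 with CUDA 11.4 is used.
Install torch = 1.11.0 and torchvision = 0.12.0 (I can write a separate tutorial if there is interest).
Download the CenterFormer code: link
Then run the following steps: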

cd centerformer
pip install -r requirements.txt
sh setup.sh
# Remember to run the export below every time you configure the environment or run the code
# add CenterFormer to PYTHONPATH by adding the following line to ~/.bashrc (change the path accordingly)
# export PYTHONPATH="${PYTHONPATH}:PATH_TO_CENTERFORMER"
export PYTHONPATH="${PYTHONPATH}:/A_colorful/code/CenterFormer/centerformer"  #Orin
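
Forgetting the export is an easy mistake, so a quick check before training can save time. This is a small sketch under my own assumptions: the function names are hypothetical, and `det3d` is taken to be the package directory that sits in the CenterFormer root (it appears in the tracebacks later in this post).

```python
import os
from pathlib import Path


def on_pythonpath(root: str, pythonpath: str) -> bool:
    """True if `root` is one of the entries of a PYTHONPATH-style string."""
    target = Path(root).resolve()
    entries = [p for p in pythonpath.split(os.pathsep) if p]
    return any(Path(p).resolve() == target for p in entries)


def has_det3d(root: str) -> bool:
    """True if `root` contains the det3d package directory."""
    return (Path(root) / "det3d").is_dir()
```

For example, `on_pythonpath("/A_colorful/code/CenterFormer/centerformer", os.environ.get("PYTHONPATH", ""))` should be true in a shell where the export has been run.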

Data setup

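This guide uses the nuScenes mini dataset, v1.0-mini.
Create the path /A_colorful/code/CenterFormer/centerformer/data/nuscenes, that is, a data/nuscenes folder under the CenterFormer root, and extract v1.0-mini.zip into it. The folder contents are shown in the figure below. (The figure was captured after data preparation had already been run; a fresh extraction should contain only maps, samples, sweeps, v1.0-mini and .v1.0-mini.txt.)
(Figure: nuScenes data layout)
Then run the data preparation: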

# nuScenes
python tools/create_data.py nuscenes_data_prep --root_path=NUSCENES_TRAINVAL_DATASET_ROOT --version="v1.0-trainval" --nsweeps=10

On this machine, --root_path=NUSCENES_TRAINVAL_DATASET_ROOT and --version="v1.0-trainval" in that command need to be changed. The modified command is:

python tools/create_data.py nuscenes_data_prep --root_path=/A_colorful/code/CenterFormer/centerformer/data/nuscenes --version="v1.0-mini" --nsweeps=10

After data preparation finishes, the folder contents match the figure above.
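
If create_data.py fails, the first thing worth checking is whether the archive was extracted into the right place. The sketch below checks only the sub-directories named in this section; `missing_entries` is a hypothetical helper, not part of CenterFormer or the nuScenes devkit.

```python
from pathlib import Path

# Sub-directories this post lists for a freshly extracted v1.0-mini archive.
REQUIRED_DIRS = ("maps", "samples", "sweeps", "v1.0-mini")


def missing_entries(root: str, required=REQUIRED_DIRS) -> list:
    """Return the required sub-directories that are absent under `root`."""
    base = Path(root)
    return [name for name in required if not (base / name).is_dir()]
```

Running `missing_entries("/A_colorful/code/CenterFormer/centerformer/data/nuscenes")` should return an empty list when the layout is as described.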

Installing spconv (must be version 1.x; 2.x does not work), apex (optional) and TensorFlow (optional)

spconv installation

First install cumm

export CUMM_CUDA_VERSION="11.4" # 11.4 is the CUDA version
export CUMM_DISABLE_JIT="1" # do not JIT-compile cumm; build a whl and install it instead
export CUMM_CUDA_ARCH_LIST="8.7" # Xavier is 7.2, TX2 is 6.2, Orin is 8.7
######### Note: do not git clone directly
# Open https://github.com/FindDefinition/cumm/tree/main and download the main branch; other branches fail to build
cd cumm-main # cd to the cumm source root; folder location: /A_colorful/code/CenterFormer/cumm-main/dist
python setup.py bdist_wheel # build the cumm whl into the dist folder
pip install dist/xxx.whl # install the built cumm whl; the name should look like cumm_cu114-0.2.8-cp38-cp38-linux_aarch64.whl
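
The three exports above are easy to get wrong when switching between Jetson boards. As a rough sketch, the same settings can be assembled in Python and passed to a build subprocess. The device-to-architecture table simply restates the values in the comments above, and `cumm_build_env` is a hypothetical helper.

```python
import os

# Compute capabilities quoted in the comments of this post.
JETSON_CUDA_ARCH = {"tx2": "6.2", "xavier": "7.2", "orin": "8.7"}


def cumm_build_env(device: str, cuda_version: str = "11.4", base_env=None) -> dict:
    """Return a copy of the environment with the cumm build variables set."""
    key = device.lower()
    if key not in JETSON_CUDA_ARCH:
        raise ValueError(
            f"unknown device {device!r}; expected one of {sorted(JETSON_CUDA_ARCH)}"
        )
    env = dict(os.environ if base_env is None else base_env)
    env["CUMM_CUDA_VERSION"] = cuda_version
    env["CUMM_DISABLE_JIT"] = "1"  # build a wheel instead of JIT compiling
    env["CUMM_CUDA_ARCH_LIST"] = JETSON_CUDA_ARCH[key]
    return env


# Possible usage (not executed here):
#   import subprocess, sys
#   subprocess.run([sys.executable, "setup.py", "bdist_wheel"],
#                  cwd="cumm-main", env=cumm_build_env("orin"), check=True)
```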

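Incorrect spconv installation

This section is a counter-example and can be skipped; go straight to the correct spconv installation method.
After cumm is installed, install spconv. Note that this method automatically installs spconv = 2.3.6, which makes CenterFormer fail later during training because spconv 2.x does not support the SparseConv3d module the way CenterFormer uses it. spconv = 1.2.1 must be installed instead.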

export CUMM_CUDA_VERSION="11.4" # 11.4 is the CUDA version
export SPCONV_DISABLE_JIT="1" # do not JIT-compile spconv; build a whl and install it instead
export CUMM_CUDA_ARCH_LIST="8.7" # Xavier is 7.2, TX2 is 6.2, Orin is 8.7
######### Note: do not git clone directly
# Open https://github.com/traveller59/spconv and download the master branch; other branches fail to compile
# git clone -b v2.1.22 https://github.com/traveller59/spconv --recursive # replace v2.1.22 with the spconv tag you want; note that --recursive is required
cd spconv # cd to the spconv source root; folder location: /A_colorful/code/CenterFormer/spconv-master/dist/
pip install pccm wheel # install some dependencies
python setup.py bdist_wheel # build the spconv whl into the dist folder
pip install dist/xxx.whl # install the built spconv whl; the name should look like spconv_cu114-2.1.22-cp38-cp38-linux_aarch64.whl

Correct spconv installation method

git clone -b v1.2.1 https://github.com/traveller59/spconv --recursive 
cd spconv # cd to the spconv source root; folder location: /A_colorful/code/CenterFormer/spconv-final-1.2.1
export CUMM_CUDA_VERSION="11.4" # 11.4 is the CUDA version
export CUMM_CUDA_ARCH_LIST="8.7" # Xavier is 7.2, TX2 is 6.2, Orin is 8.7
pip install pccm wheel # install some dependencies
python setup.py bdist_wheel # build the spconv whl into the dist folder

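The build now fails with: fatal error: THC/THCNumerics.cuh: No such file or directory

This happens because newer versions of torch no longer fully support the THC library. The fix is to comment out the statement that includes this header.

(Comment out only that one line, i.e. //#include <THC/THCNumerics.cuh>; the line above it, #include <THC/THCAtomics.cuh>, must be kept.)

(Figure: the commented-out include)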
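
If you prefer not to edit the file by hand, the same one-line change can be scripted. This is only a sketch: `comment_out_include` is a hypothetical helper, and the file to patch is whichever source file the compiler error points at (I am not assuming a specific path here).

```python
import re


def comment_out_include(source: str, header: str) -> str:
    """Prefix `#include <header>` lines with `//`, leaving other lines untouched.

    Lines that are already commented out do not match, so the function can be
    applied repeatedly without stacking comment markers.
    """
    pattern = re.compile(
        r"^([ \t]*)(#[ \t]*include[ \t]*<" + re.escape(header) + r">.*)$",
        re.MULTILINE,
    )
    return pattern.sub(r"\1//\2", source)


# Possible usage (not executed here); `cu_file` is the file named in the error:
#   from pathlib import Path
#   path = Path(cu_file)
#   path.write_text(comment_out_include(path.read_text(), "THC/THCNumerics.cuh"))
```

Only the THCNumerics include is touched; the THCAtomics include stays as it is.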

Run python setup.py bdist_wheel again; the build now succeeds.

Once it succeeds, install spconv = 1.2.1:

cd dist
pip install xxx.whl # install the spconv 1.2.1 whl built in the previous step
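
Since the 2.x build is so easy to install by accident, it may be worth checking which spconv ended up in the environment. The sketch below uses the standard library's importlib.metadata. The distribution names it probes are assumptions on my part: "spconv" for a 1.x source build, and "spconv-cu114" inferred from the 2.x wheel file name quoted earlier.

```python
from importlib import metadata

# Distribution names to probe; both are assumptions, extend the tuple as needed.
CANDIDATE_DISTS = ("spconv", "spconv-cu114")


def is_spconv_1x(version: str) -> bool:
    """True if the version string has major version 1 (e.g. '1.2.1')."""
    return version.split(".")[0] == "1"


def installed_spconv_versions(names=CANDIDATE_DISTS) -> dict:
    """Map each installed candidate distribution name to its version string."""
    found = {}
    for name in names:
        try:
            found[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            continue
    return found
```

If `installed_spconv_versions()` reports a version for which `is_spconv_1x` is false, uninstall it and reinstall the 1.2.1 wheel.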

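apex installation

Installing apex directly with pip leaves it impossible to import. I followed this blog post, https://blog.csdn.net/hhhhhhhhhhwwwwwwwwww/article/details/128074783, but that installation reported errors.
The following commands install a Python-only build of apex: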

git clone https://github.com/NVIDIA/apex 
cd apex 
# pip install -v --no-cache-dir ./
# Python-only build; this resolves the import failure
pip install -v --disable-pip-version-check --no-build-isolation --no-cache-dir ./

TensorFlow dependency (not needed if you only use the nuScenes dataset)
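
The test logs later in this post print "no apex" and "No Tensorflow" and the run still proceeds, which is consistent with both packages being optional. To see which optional packages are present without importing them, a small sketch like the following can be used (`optional_status` is a hypothetical helper name):

```python
import importlib.util


def optional_status(modules=("apex", "tensorflow")) -> dict:
    """Map each top-level module name to whether it can be found for import."""
    return {name: importlib.util.find_spec(name) is not None for name in modules}
```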

Testing and training

Distributed training

Official command (8 GPUs)

python -m torch.distributed.launch --nproc_per_node=8 ./tools/train.py CONFIG_PATH

Orin command (single GPU)

python -m torch.distributed.launch --nproc_per_node=1 ./tools/train.py /A_colorful/code/CenterFormer/centerformer/configs/nusc/nuscenes_centerformer_deformable_separate_detection_head.py

Distributed testing

Official command (8 GPUs)

python -m torch.distributed.launch --nproc_per_node=8 ./tools/dist_test.py CONFIG_PATH --work_dir work_dirs/CONFIG_NAME --checkpoint work_dirs/CONFIG_NAME/latest.pth

Orin command (single GPU)

python -m torch.distributed.launch --nproc_per_node=1 ./tools/dist_test.py ./configs/nusc/nuscenes_centerformer_deformable_separate_detection_head.py --work_dir work_dirs/nuscenes_centerformer_deformable_separate_detection_head --checkpoint work_dirs/nuscenes_centerformer_deformable_separate_detection_head/latest.pth
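
Comparing the official command with my Orin command, CONFIG_NAME is just the config file name without the .py extension. The sketch below builds the test command from a config path on that assumption; `dist_test_command` is a hypothetical helper, and it only assembles the argument list without running anything.

```python
import shlex
from pathlib import Path


def dist_test_command(config_path: str, nproc: int = 1,
                      checkpoint: str = "latest.pth") -> list:
    """Build the dist_test.py command line for a given config file."""
    config_name = Path(config_path).stem  # file name without ".py"
    work_dir = f"work_dirs/{config_name}"
    return [
        "python", "-m", "torch.distributed.launch",
        f"--nproc_per_node={nproc}",
        "./tools/dist_test.py", config_path,
        "--work_dir", work_dir,
        "--checkpoint", f"{work_dir}/{checkpoint}",
    ]


def as_shell(command: list) -> str:
    """Render the argument list as a single shell-safe string."""
    return shlex.join(command)
```

Passing the config path from the Orin command above to `dist_test_command` and then to `as_shell` reproduces that command as a single string.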

Note: testing fails several times along the way. Fixes:

1. v1.0-trainval cannot be found (caused by using the mini dataset). The error is:

no apex
No Tensorflow
Deformable Convolution not built!
Deformable Convolution not built!
2023-08-10 18:55:57,489 - INFO - Distributed testing: False
2023-08-10 18:55:57,490 - INFO - torch.backends.cudnn.benchmark: False
Use center number 500 in inference
Use heatmap score threshold 0.03 in inference
2023-08-10 18:55:57,702 - INFO - Finish RPN_transformer_deformable Initialization
2023-08-10 18:55:57,703 - INFO - num_classes: [1, 2, 2, 1, 2, 2]
Use HM Bias:  -2.19
2023-08-10 18:55:57,738 - INFO - Finish CenterHeadIoU Initialization
Use Val Set
use gt label assigning kernel size  1
10
parameter size: 14396822
2023-08-10 18:55:59,869 - INFO - work dir: work_dirs/nuscenes_centerformer_deformable_separate_detection_head
[                                                  ] 0/81, elapsed: 0s, ETA:/A_colorful/code/CenterFormer/centerformer/det3d/models/necks/rpn_transformer_multitask.py:677: UserWarning: __floordiv__ is deprecated, and its behavior will change in a future version of pytorch. It currently rounds toward 0 (like the 'trunc' function NOT 'floor'). This results in incorrect rounding for negative values. To keep the current behavior, use torch.div(a, b, rounding_mode='trunc'), or for actual floor division, use torch.div(a, b, rounding_mode='floor').
  y_coor = order_all // W
/home/nvidia/anaconda3/envs/centerformer/lib/python3.8/site-packages/torch/functional.py:568: UserWarning: torch.meshgrid: in an upcoming release, it will be required to pass the indexing argument. (Triggered internally at  /media/nvidia/NVME/pytorch/JetPack_5.0/pytorch-v1.11.0/aten/src/ATen/native/TensorShape.cpp:2227.)
  return _VF.meshgrid(tensors, **kwargs)  # type: ignore[attr-defined]
[>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>] 81/81, 0.9 task/s, elapsed: 93s, ETA:     0s
 Total time per frame:  0.0
Traceback (most recent call last):
  File "./tools/dist_test.py", line 224, in <module>
    main()
  File "./tools/dist_test.py", line 214, in main
    result_dict, _ = dataset.evaluation(copy.deepcopy(predictions), output_dir=args.work_dir, testset=args.testset)
  File "/A_colorful/code/CenterFormer/centerformer/det3d/datasets/nuscenes/nuscenes.py", line 218, in evaluation
    nusc = NuScenes(version=version, dataroot=str(self._root_path), verbose=True)
  File "/home/nvidia/anaconda3/envs/centerformer/lib/python3.8/site-packages/nuscenes/nuscenes.py", line 54, in __init__
    assert osp.exists(self.table_root), 'Database version not found: {}'.format(self.table_root)
AssertionError: Database version not found: data/nuscenes/v1.0-trainval

Fix: in /A_colorful/code/CenterFormer/centerformer/det3d/datasets/nuscenes/nuscenes.py, change version="v1.0-trainval" to version="v1.0-mini". This resolves the error above. If problems remain, also change all three occurrences of v1.0-trainval in /A_colorful/code/CenterFormer/centerformer/tools/nusc_tracking/pub_test.py to v1.0-mini.
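
The assertion in the traceback fails because the folder data/nuscenes/v1.0-trainval does not exist. Rather than hard-coding the version string, one option is to pick whichever version folder is actually present. This is a sketch of that idea and not how CenterFormer does it; `detect_nuscenes_version` is a hypothetical helper, and the candidate list holds only the two versions mentioned in this post.

```python
from pathlib import Path

# The two dataset versions that appear in this post.
KNOWN_VERSIONS = ("v1.0-trainval", "v1.0-mini")


def detect_nuscenes_version(root: str, candidates=KNOWN_VERSIONS) -> str:
    """Return the first candidate whose table folder exists under `root`."""
    base = Path(root)
    for version in candidates:
        if (base / version).is_dir():
            return version
    raise FileNotFoundError(
        f"none of {list(candidates)} found under {base}"
    )
```

With only the mini dataset extracted, `detect_nuscenes_version("data/nuscenes")` would return "v1.0-mini".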

2. NumPy no longer supports np.float; it must be changed to np.float64.
    The error is:
parameter size: 14396822
2023-08-10 18:59:48,162 - INFO - work dir: work_dirs/nuscenes_centerformer_deformable_separate_detection_head
[                                                  ] 0/81, elapsed: 0s, ETA:/A_colorful/code/CenterFormer/centerformer/det3d/models/necks/rpn_transformer_multitask.py:677: UserWarning: __floordiv__ is deprecated, and its behavior will change in a future version of pytorch. It currently rounds toward 0 (like the 'trunc' function NOT 'floor'). This results in incorrect rounding for negative values. To keep the current behavior, use torch.div(a, b, rounding_mode='trunc'), or for actual floor division, use torch.div(a, b, rounding_mode='floor').
  y_coor = order_all // W
/home/nvidia/anaconda3/envs/centerformer/lib/python3.8/site-packages/torch/functional.py:568: UserWarning: torch.meshgrid: in an upcoming release, it will be required to pass the indexing argument. (Triggered internally at  /media/nvidia/NVME/pytorch/JetPack_5.0/pytorch-v1.11.0/aten/src/ATen/native/TensorShape.cpp:2227.)
  return _VF.meshgrid(tensors, **kwargs)  # type: ignore[attr-defined]
[>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>] 81/81, 0.9 task/s, elapsed: 95s, ETA:     0s
 Total time per frame:  0.0
======
Loading NuScenes tables for version v1.0-mini...
23 category,
8 attribute,
4 visibility,
911 instance,
12 sensor,
120 calibrated_sensor,
31206 ego_pose,
8 log,
10 scene,
404 sample,
31206 sample_data,
18538 sample_annotation,
4 map,
Done loading in 1.1 seconds.
======
Reverse indexing ...
Done reverse indexing in 0.2 seconds.
======
Finish generate predictions for testset, save to work_dirs/nuscenes_centerformer_deformable_separate_detection_head/infos_val_10sweeps_withvelo_filter_True.json
Initializing nuScenes detection evaluation
Loaded results from work_dirs/nuscenes_centerformer_deformable_separate_detection_head/infos_val_10sweeps_withvelo_filter_True.json. Found detections for 81 samples.
Loading annotations for mini_val split from nuScenes version: v1.0-mini
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 81/81 [00:00<00:00, 89.95it/s]
Loaded ground truth annotations for 81 samples.
Filtering predictions
=> Original number of boxes: 18450
=> After distance based filtering: 13167
=> After LIDAR points based filtering: 13167
=> After bike rack filtering: 13151
Filtering ground truth annotations
=> Original number of boxes: 4441
=> After distance based filtering: 3785
=> After LIDAR points based filtering: 3393
=> After bike rack filtering: 3393
Rendering sample token b6c420c3a5bd4a219b1cb82ee5ea0aa7
Rendering sample token b22fa0b3c34f47b6a360b60f35d5d567
Rendering sample token d8251bbc2105497ab8ec80827d4429aa
Rendering sample token 372725a4b00e49c78d6d0b1c4a38b6e0
Rendering sample token ce94ef7a0522468e81c0e2b3a2f1e12d
Rendering sample token 0d0700a2284e477db876c3ee1d864668
Rendering sample token 61a7bd24f88a46c2963280d8b13ac675
Rendering sample token fa65a298c01f44e7a182bbf9e5fe3697
Rendering sample token 8573a885a7cb41d185c05029eeb9a54e
Rendering sample token 38a28a3aaf2647f2a8c0e90e31267bf8
Accumulating metric data...
Traceback (most recent call last):
  File "./tools/dist_test.py", line 224, in <module>
    main()
  File "./tools/dist_test.py", line 214, in main
    result_dict, _ = dataset.evaluation(copy.deepcopy(predictions), output_dir=args.work_dir, testset=args.testset)
  File "/A_colorful/code/CenterFormer/centerformer/det3d/datasets/nuscenes/nuscenes.py", line 288, in evaluation
    eval_main(
  File "/A_colorful/code/CenterFormer/centerformer/det3d/datasets/nuscenes/nusc_common.py", line 521, in eval_main
    metrics_summary = nusc_eval.main(plot_examples=10,)
  File "/home/nvidia/anaconda3/envs/centerformer/lib/python3.8/site-packages/nuscenes/eval/detection/evaluate.py", line 204, in main
    metrics, metric_data_list = self.evaluate()
  File "/home/nvidia/anaconda3/envs/centerformer/lib/python3.8/site-packages/nuscenes/eval/detection/evaluate.py", line 116, in evaluate
    md = accumulate(self.gt_boxes, self.pred_boxes, class_name, self.cfg.dist_fcn_callable, dist_th)
  File "/home/nvidia/anaconda3/envs/centerformer/lib/python3.8/site-packages/nuscenes/eval/detection/algo.py", line 133, in accumulate
    tp = np.cumsum(tp).astype(np.float)
  File "/home/nvidia/anaconda3/envs/centerformer/lib/python3.8/site-packages/numpy/__init__.py", line 305, in __getattr__
    raise AttributeError(__former_attrs__[attr])
AttributeError: module 'numpy' has no attribute 'float'.
`np.float` was a deprecated alias for the builtin `float`. To avoid this error in existing code, use `float` by itself. Doing this will not modify any behavior and is safe. If you specifically wanted the numpy scalar type, use `np.float64` here.

Fix: open the file with gedit /home/nvidia/anaconda3/envs/centerformer/lib/python3.8/site-packages/nuscenes/eval/detection/algo.py
and change tp = np.cumsum(tp).astype(np.float) to tp = np.cumsum(tp).astype(np.float64)
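
If the same alias shows up in more than one place, the edit can be scripted instead of done by hand. The sketch below rewrites only the bare np.float alias and leaves names such as np.float64 or np.float32 alone; `replace_np_float` is a hypothetical helper, and it is worth backing up the file before overwriting it.

```python
import re

# Match "np.float" only when it is not followed by further word characters,
# so np.float64, np.float32 and np.float_ are left untouched.
_NP_FLOAT = re.compile(r"\bnp\.float\b")


def replace_np_float(source: str) -> str:
    """Replace the removed `np.float` alias with `np.float64`."""
    return _NP_FLOAT.sub("np.float64", source)


# Possible usage (not executed here); `algo_py` is the path of the file to patch:
#   from pathlib import Path
#   path = Path(algo_py)
#   path.write_text(replace_np_float(path.read_text()))
```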

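Those are the pitfalls I hit during installation and how I got past them. I have only just started working in this area, so feel free to message me or discuss in the comments!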
