
# CenterFusion on Windows

###### tags: `Centerfusion`

```gherkin=
conda create -n env1 python=3.7
conda activate env1
pip install torch==1.7.1+cu110 torchvision==0.8.2+cu110 torchaudio==0.7.2 -f https://download.pytorch.org/whl/torch_stable.html
pip install cython
pip install git+https://github.com/philferriere/cocoapi.git#subdirectory=PythonAPI
git clone https://github.com/mrnabati/CenterFusion.git
cd CenterFusion
pip install -r requirements.txt
git clone https://github.com/lbin/DCNv2.git
# DCNv2 must be placed in D:\CenterFusion\src\lib\model\networks
cd D:\CenterFusion\src\lib\model\networks\DCNv2
sh make.sh
```

The build succeeded if the output ends with `Finished processing dependencies for DCNv2==0.1`.

## Data download

Official site: https://www.nuscenes.org/download

![](https://i.imgur.com/VngLM8v.png)

Create a new folder named `nuscenes` under `CenterFusion/data` and extract the downloaded data into it.

### Modifications

Make the following changes in `CenterFusion/src/tools/convert_nuScenes.py`.

#### ( 1 ) Reduce `splits` to just these two lines

![](https://i.imgur.com/eP41o7s.png)

#### ( 2 ) num_sweeps=3

![](https://i.imgur.com/KhpMzKB.png)

Then run the script:

```gherkin=
cd D:\CenterFusion\src\tools
python convert_nuScenes.py
```

On success, a new folder appears (the topmost one in the screenshot below):

![](https://i.imgur.com/jqbKJH8.png)

## DEBUG visualization

In the same `convert_nuScenes.py`, set `DEBUG` to `True`:

![](https://i.imgur.com/iz72mzc.png)

Also change **pc_3d to radar_pc** here, because `image_info` has no `pc_3d` attribute; the attribute is called `radar_pc`:

![](https://i.imgur.com/ot5kpSf.png)

After running `convert_nuScenes.py`, three images are generated in the same directory: a point cloud plot, a plot of the 3D boxes, and a visualization produced by the nuscenes-devkit library.

## Pretrained model download

Download two models:

① centerfusion_e60: [download link](https://drive.google.com/uc?export=download&id=1XaYx7JJJmQ6TBjCJJster-Z7ERyqu4Ig)
② centernet_baseline_e170: [download link](https://drive.google.com/uc?export=download&id=1iFF7a5oueFfB5GnUoHFDnntFdTst-bVI)

Then extract them into the `CenterFusion/models` folder.

## Train

Detailed walkthrough of the training script `CenterFusion/src/main.py`: [可乐有点好喝](https://blog.csdn.net/ssj925319/article/details/124502583)

Parameter changes: before training, edit the parameters in `train.sh`.

(1) Set the training split `train_split`. The options are train and mini_train; I chose the mini one.
(2) Set the validation split `val_split`. The options are val, mini_val and test.
(3) Adjust the batch size to suit your hardware.
(4) I only have one GPU, so I changed the parameter from gpu 0,1 to gpu 0. With two GPUs, no change is needed.
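The four parameter choices above can be checked before they are typed into `train.sh`. The following is only a minimal sketch: the helper name `train_overrides` and the batch size of 8 are made up for illustration, and the flag spellings (`--train_split`, `--val_split`, `--batch_size`, `--gpus`) are assumed from the parameter names mentioned above, so compare them with your own `train.sh` before relying on them.

```python
# Allowed values as listed in the note above.
TRAIN_SPLITS = ("train", "mini_train")
VAL_SPLITS = ("val", "mini_val", "test")


def train_overrides(train_split, val_split, batch_size, gpu_ids):
    """Validate the choices and return them as command-line style arguments.

    Hypothetical helper: flag names are assumptions, not taken from the repo.
    """
    if train_split not in TRAIN_SPLITS:
        raise ValueError(f"train_split must be one of {TRAIN_SPLITS}")
    if val_split not in VAL_SPLITS:
        raise ValueError(f"val_split must be one of {VAL_SPLITS}")
    if batch_size < 1:
        raise ValueError("batch_size must be a positive integer")
    if not gpu_ids:
        raise ValueError("at least one GPU id is required")
    return [
        "--train_split", train_split,
        "--val_split", val_split,
        "--batch_size", str(batch_size),
        "--gpus", ",".join(str(g) for g in gpu_ids),
    ]


# Single-GPU mini setup, as chosen in this note (batch size is an example value).
print(" ".join(train_overrides("mini_train", "mini_val", 8, [0])))
```

With two GPUs the last argument would be `[0, 1]`, which yields `--gpus 0,1`.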
![](https://i.imgur.com/xxK8YcR.png)

### debug

Open `envs/env1/lib/subprocess.py` and, at line 410, change `check=True` to `check=False`. (Find the `subprocess.py` that belongs to the environment you are using; the path will not necessarily be the same.)

![](https://i.imgur.com/HQfHEHI.png)

The `CenterFusion/src/tools/nuscenes-devkit` folder is empty, so it has to be downloaded manually:

```gherkin=
git clone https://github.com/nutonomy/nuscenes-devkit.git  # replaces the original nuscenes-devkit folder
```

In `CenterFusion/src/lib/logger.py`, line 52, change 'w' to 'a':

```gherkin=
self.log = open(log_dir + '/log.txt', 'w')
```

This opens the log in write mode, which discards the existing contents of the file every time it is opened. It should be opened in append mode instead, so **change the argument 'w' to 'a'**.

Now start training:

```gherkin=
cd CenterFusion
sh experiments/train.sh
```

**RuntimeError: cuDNN error: CUDNN_STATUS_EXECUTION_FAILED**

If this error appears, the CUDA, cuDNN and PyTorch versions do not match. In my case, even though I had used the following command

```gherkin=
pip install torch==1.7.1+cu110 torchvision==0.8.2+cu110 torchaudio==0.7.2 -f https://download.pytorch.org/whl/torch_stable.html
```

checking inside the environment with the commands below showed that a different torchvision version had actually been installed:

```gherkin=
cd\
python
import torchvision
torchvision.__version__  # check the torchvision version
import torch
torch.__version__  # check the PyTorch version
exit()  # leave Python
```

Use the following commands to uninstall it and install the correct version in the environment:

```gherkin=
pip uninstall torchvision
pip install torchvision==0.8.2+cu110 -f https://download.pytorch.org/whl/torch_stable.html
```

### Visualizing the training process

The logs folders under `D:\CenterFusion\exp\ddd\centerfusion` can be used for visualization:

![](https://i.imgur.com/K6fzwgx.png)

```gherkin=
cd\
tensorboard --logdir=D:\CenterFusion\exp\ddd\centerfusion\logs_2022-08-04-13-37
```

Output like the screenshot below appears. Note the address in the red box: connecting to this default port shows the visualization.

![](https://i.imgur.com/by271MA.png)

Copy the URL into a browser to see how each training metric changes:

![](https://i.imgur.com/ZpZ4zxR.png)

AP: average precision (average rate of correct detections)
LR: learning rate
Scores:
NDS: weighted sum of mAP and the error metrics
attr_err: mean attribute error
orient_err: mean orientation error
scale_err: mean scale error
trans_err: mean translation error
vel_err: mean velocity error

## Test

Detailed walkthrough of the validation script `CenterFusion/src/test.py`: [可乐有点好喝](https://blog.csdn.net/ssj925319/article/details/124640954) :+1:

**( 1 ) The value on the first line must match the gpus value you set**
**( 2 ) Set the validation split val_split; the options are val, mini_val and test**

### Visualization ( optional, slows down test )

Append `--debug 4 \` as the last line of `D:\CenterFusion\experiments\test.sh`:

![](https://i.imgur.com/aLqAf3d.png)

At line 424 of `CenterFusion/src/lib/utils/debugger.py`, convert the four rect values to int, as shown below:

![](https://i.imgur.com/qOcOKDX.png)

### debug

:::warning
On Windows, multiprocessing starts child processes with spawn and uses pickle serialization to move data between processes, and socket objects cannot be pickled. On Linux this is not a problem, because multiprocessing uses fork there. So on Windows, multithreading can be used instead. Network communication is I/O-bound and makes few demands on the CPU, **so multithreading is sufficient and multiprocessing is not needed**.
:::

```
TypeError: can't pickle module objects
[W ..\torch\csrc\CudaIPCTypes.cpp:22] Producer process has been terminated before all shared CUDA tensors released. See Note [Sharing CUDA tensors]

(env1) D:\CenterFusion>Using tensorboardX
C:\Users\User\anaconda3\envs\env1\lib\site-packages\sklearn\utils\linear_assignment_.py:21: DeprecationWarning: The linear_assignment_ module is deprecated in 0.21 and will be removed from 0.23. Use scipy.optimize.linear_sum_assignment instead.
  DeprecationWarning)
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "C:\Users\User\anaconda3\envs\env1\lib\multiprocessing\spawn.py", line 105, in spawn_main
    exitcode = _main(fd)
  File "C:\Users\User\anaconda3\envs\env1\lib\multiprocessing\spawn.py", line 115, in _main
    self = reduction.pickle.load(from_parent)
EOFError: Ran out of input
```

I made the following changes.

#### ( 1 ) Set num_workers to 0 in test.sh (single process)

#### ( 2 ) Change the default of num_workers to 0 in D:\CenterFusion\src\lib\opts.py

![](https://i.imgur.com/1m3lI1s.png)

These two changes alone did not fix the problem. It turned out that one more place had to be changed:

#### ( 3 ) Set num_workers = 0 in D:\CenterFusion\src\test.py

![](https://i.imgur.com/jPoADLH.png)

Debugging is now complete.

#### Run test.sh

```gherkin=
cd CenterFusion
sh experiments/test.sh
```

![](https://i.imgur.com/l5ARJv8.png)

![](https://i.imgur.com/1VgTqWz.png)

## Full training

### Download

Official site: https://www.nuscenes.org/download

Download the Metadata. Then, for each part, only three items are needed: Keyframe (keyframe data), Radar (millimeter-wave radar data) and Camera (camera data). Lidar (laser radar data) does not need to be downloaded. There are **10 parts in total**.

![](https://i.imgur.com/14hyInZ.png)

### Test part

Download the Metadata, then only Radar and Camera.

![](https://i.imgur.com/nDfLnVb.png)

First create a new folder named `nuscenes` under `CenterFusion/data`, then extract the tgz archives into it. The contents are placed in the correct locations automatically.

```
tar -zxvf <archive>.tgz -C <target path>
```

**After extraction, the layout inside nuscenes is as follows**
![](https://i.imgur.com/jZJ57ZQ.png)

### Source code changes

At line 27 of `CenterFusion/src/tools/convert_nuScenes.py`, make the following change:

```
SPLITS = {
    'mini_val': 'v1.0-mini',
    'mini_train': 'v1.0-mini',
    'train': 'v1.0-trainval',
    'val': 'v1.0-trainval',
    'test': 'v1.0-test',
}
```

Then run `convert_nuScenes.py` to perform the conversion.

This creates the path `CenterFusion/data/nuscenes/annotations_3sweeps`, which holds the dataset in COCO format.

After that, train and test can be run on the complete datasets.
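Because the conversion over the full dataset takes a while, it can help to confirm first that every nuScenes version named in `SPLITS` was actually extracted. The sketch below is one possible pre-flight check, not part of CenterFusion: the function name `missing_versions` is hypothetical, and it assumes each version's metadata sits in a folder of the same name (for example `v1.0-trainval`) directly under `data/nuscenes`.

```python
from pathlib import Path

# Same mapping as in convert_nuScenes.py: split name -> nuScenes version.
SPLITS = {
    'mini_val': 'v1.0-mini',
    'mini_train': 'v1.0-mini',
    'train': 'v1.0-trainval',
    'val': 'v1.0-trainval',
    'test': 'v1.0-test',
}


def missing_versions(data_root, splits=SPLITS):
    """Return the version folders required by `splits` that are absent under data_root."""
    root = Path(data_root)
    needed = sorted(set(splits.values()))  # each version only once
    return [version for version in needed if not (root / version).is_dir()]


if __name__ == "__main__":
    # Assumed location relative to the CenterFusion folder; adjust as needed.
    missing = missing_versions("data/nuscenes")
    if missing:
        print("Missing version folders:", ", ".join(missing))
    else:
        print("All versions referenced by SPLITS are present.")
```

Run it from the `CenterFusion` folder before `convert_nuScenes.py`. An empty result only means the metadata folders exist; it does not verify that all camera and radar parts were downloaded.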