Installing DAB-DETR

Hồ Chí Minh, 16-08-2023

[Võ Duy Nguyên](https://nguyenvd-uit.github.io/), [Lê Duy Nguyên](https://github.com/Ngyyen), [UIT-Together Research Group](https://uit-together.github.io/)

# DAB-DETR: Dynamic Anchor Boxes are better queries for DETR

## Table of Contents

[TOC]

## Step 1. Set up the environment

### Step 1.1. Create an Anaconda environment

Name the environment with your given name followed by the initials of your family and middle names, e.g. Le Duy Nguyen -> Nguyenld. The commands in this guide use the name UITTogether.

```bash
conda create --name UITTogether python=3.8 -y
```

Screenshot after the environment has been created:

![](https://hackmd.io/_uploads/BkWrrM923.png)

### Step 1.2. Activate the new environment

```bash
conda activate UITTogether
```

![](https://hackmd.io/_uploads/r1HeBzq33.png)

### Step 1.3. Install PyTorch for GPU platforms

```bash
pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118
```

Screenshot after a successful installation:

![](https://hackmd.io/_uploads/rJbZuzc3h.png)

## Step 2. Install detrex and detectron2

Go to the LuuTru directory, e.g. /home/cvpr2023/LuuTru/

```bash
cd LuuTru/
```

Create a directory with the same name as the environment above, then enter it:

![](https://hackmd.io/_uploads/ByVGkxKih.png)

```bash
mkdir -p UITTogether
cd UITTogether/
```

### Step 2.1. Install detrex

In this directory, clone detrex and enter the repository:

```bash
git clone https://github.com/IDEA-Research/detrex.git
cd detrex
```

Screenshots after a successful clone:

![](https://hackmd.io/_uploads/HyAhBz9hn.png)

![](https://hackmd.io/_uploads/r1VN8Mq22.png)

### Step 2.2. Initialize the detectron2 submodule

```bash
git submodule init
git submodule update
```

Screenshot after a successful initialization:

![](https://hackmd.io/_uploads/ByQPdfqhn.png)

### Step 2.3. Install detectron2

```bash
python -m pip install -e detectron2
```

Screenshot after a successful installation:

![](https://hackmd.io/_uploads/rJxTYG5n2.png)

### Step 2.4. Build an editable version of detrex

```bash
pip install -e .
```

Screenshot after a successful build:

![](https://hackmd.io/_uploads/HJ-Y9Gqnh.png)

## Step 3. Verify the installation

### Step 3.1. Download the pretrained model and the demo image

```bash
# download pretrained DAB-DETR model
wget https://github.com/IDEA-Research/detrex-storage/releases/download/v0.1.0/dab_detr_r50_50ep.pth

# download the demo image
wget https://github.com/IDEA-Research/detrex-storage/releases/download/v0.2.1/idea.jpg
```

Screenshot after the pretrained model and the demo image have been downloaded:

![](https://hackmd.io/_uploads/Hy1j3zq32.png)

### Step 3.2. Run inference with the pretrained model on the demo image

```bash
python demo/demo.py --config-file projects/dab_detr/configs/dab_detr_r50_50ep.py \
    --input "./idea.jpg" \
    --output "./demo_output.jpg" \
    --opts train.init_checkpoint="./dab_detr_r50_50ep.pth"
```

The result is saved to the file demo_output.jpg, e.g. /home/cvpr2023/LuuTru/UITTogether/detrex/demo_output.jpg

![](https://hackmd.io/_uploads/H1VyJQc23.jpg)

### Step 3.3. Evaluate the pretrained model on the COCO 2017 dataset

Download the COCO 2017 dataset and store it at "/home/cvpr2023/LuuTru/dataset/coco/". Only the image folders train2017 and val2017 are needed, plus the annotations folder, which contains the JSON file for each image folder.

![](https://hackmd.io/_uploads/rk5z-Q9hh.png)

The COCO 2017 dataset is organized in COCO format with the following structure:

```text
coco/
  annotations/
    instances_{train,val}2017.json
    ...
  {train,val}2017/
    # image folders corresponding to the JSON files
```

Command to run the evaluation:

```bash
export DETECTRON2_DATASETS=/home/cvpr2023/LuuTru/dataset/
python tools/train_net.py --config-file projects/dab_detr/configs/dab_detr_r50_50ep.py \
    --eval-only \
    train.init_checkpoint="./dab_detr_r50_50ep.pth"
```

The evaluation results should be close to the values in the table below:

![](https://hackmd.io/_uploads/rkoVSQ5h2.png)

## Step 4. Train the model on COCO-format datasets

### Step 4.1. Train on COCO 2017, which is already in COCO format and has a ready-made config in the toolkit

Command:

```bash
python tools/train_net.py \
    --config-file projects/dab_detr/configs/dab_detr_r50_50ep.py
```

If you get a "CUDA out of memory" error, open the config file at "projects/dab_detr/configs/dab_detr_r50_50ep.py" and set the variable dataloader.train.total_batch_size to 1.

Training has started once log lines like the following appear on screen:

![](https://hackmd.io/_uploads/Bk1DCbj2h.png)

If you still get "CUDA out of memory" with a batch size of 1, you are out of luck :>

During training, checkpoint files are saved to output/dab_detr_r50_50ep/

![](https://hackmd.io/_uploads/rJMXfzinn.png)

### Step 4.2. Train on a new custom dataset

#### Step 4.2.1. Prepare and preprocess the dataset

In this example we use the VisDrone 2019 dataset to train a new model with DAB-DETR.

After downloading the dataset, the first step is to convert its annotation files to COCO format. Each dataset ships with its own annotation format, so you will need to look up how to convert it; you can use existing code on GitHub or supporting tools such as Roboflow.

Once the dataset has been converted to COCO format, upload it to: dataset/VisDrone/cocoVisdrone/

![](https://hackmd.io/_uploads/r1BmnQs33.png)

#### Step 4.2.2. Prepare the config file

Go to "projects/dab_detr/configs/" and create a config file named "dab_detr_r50_visdrone_1ep.py" with the following content:

```python
from detrex.config import get_config
from .models.dab_detr_r50 import model

import detectron2.data.transforms as T
from detectron2.config import LazyCall as L
from detectron2.data import (
    build_detection_test_loader,
    build_detection_train_loader,
    get_detection_dataset_dicts,
    MetadataCatalog,
)
from detectron2.evaluation import COCOEvaluator
from detrex.modeling import HungarianMatcher, SetCriterion
from detrex.data import DetrDatasetMapper
from detectron2.data.datasets import register_coco_instances

# register dataset
register_coco_instances(
    "VisDrone_train",
    {},
    "/home/cvpr2023/LuuTru/dataset/VisDrone/cocoVisdrone/annotations/train.json",
    "/home/cvpr2023/LuuTru/dataset/VisDrone/cocoVisdrone/train/",
)
register_coco_instances(
    "VisDrone_val",
    {},
    "/home/cvpr2023/LuuTru/dataset/VisDrone/cocoVisdrone/annotations/val.json",
    "/home/cvpr2023/LuuTru/dataset/VisDrone/cocoVisdrone/val/",
)
register_coco_instances(
    "VisDrone_test_dev",
    {},
    "/home/cvpr2023/LuuTru/dataset/VisDrone/cocoVisdrone/annotations/test.json",
    "/home/cvpr2023/LuuTru/dataset/VisDrone/cocoVisdrone/test/",
)

# register metadata
MetadataCatalog.get("VisDrone_train").thing_classes = [
    'ignored regions', 'pedestrian', 'people', 'bicycle', 'car', 'van',
    'truck', 'tricycle', 'awning-tricycle', 'bus', 'motor', 'others',
]
MetadataCatalog.get("VisDrone_val").thing_classes = [
    'ignored regions', 'pedestrian', 'people', 'bicycle', 'car', 'van',
    'truck', 'tricycle', 'awning-tricycle', 'bus', 'motor', 'others',
]
MetadataCatalog.get("VisDrone_test_dev").thing_classes = [
    'ignored regions', 'pedestrian', 'people', 'bicycle', 'car', 'van',
    'truck', 'tricycle', 'awning-tricycle', 'bus', 'motor', 'others',
]

# load base config from another file
dataloader = get_config("common/data/coco_detr.py").dataloader
optimizer = get_config("common/optim.py").AdamW
lr_multiplier = get_config("common/coco_schedule.py").lr_multiplier_12ep
train = get_config("common/train.py").train

# initialize checkpoint to be loaded
train.init_checkpoint = "./dab_detr_r50_50ep.pth"
train.output_dir = "./output/VisDrone_dab_detr_r50_1ep"

# max training iterations
train.max_iter = 6500

# run evaluation every 1300 iters
train.eval_period = 1300

# log training information every 100 iters
train.log_period = 100

# save checkpoint every 1300 iters
train.checkpointer.period = 1300

# gradient clipping for training
train.clip_grad.enabled = True
train.clip_grad.params.max_norm = 0.1
train.clip_grad.params.norm_type = 2

# set training devices
train.device = "cuda"
model.device = train.device

# modify optimizer config
optimizer.lr = 1e-4
optimizer.betas = (0.9, 0.999)
optimizer.weight_decay = 1e-4
optimizer.params.lr_factor_func = lambda module_name: 0.1 if "backbone" in module_name else 1

# modify dataloader config
dataloader.train.num_workers = 4

# please notice that this is the total batch size across all GPUs.
# for example, with 4 gpus and a total batch size of 16,
# the batch size for each gpu is 16/4 = 4. it is set to 1 here.
dataloader.train.total_batch_size = 1

# dump the testing results into output_dir for visualization
dataloader.evaluator.output_dir = train.output_dir

# change the datasets to load
dataloader.train.dataset = L(get_detection_dataset_dicts)(names="VisDrone_train")
dataloader.test.dataset = L(get_detection_dataset_dicts)(names="VisDrone_test_dev", filter_empty=False)

# change the number of classes
model.num_classes = 12
model.criterion = L(SetCriterion)(
    num_classes=12,
    matcher=L(HungarianMatcher)(
        cost_class=2.0,
        cost_bbox=5.0,
        cost_giou=2.0,
        cost_class_type="focal_loss_cost",
        alpha=0.25,
        gamma=2.0,
    ),
    weight_dict={
        "loss_class": 1,
        "loss_bbox": 5.0,
        "loss_giou": 2.0,
    },
    loss_class_type="focal_loss",
    alpha=0.25,
    gamma=2.0,
)
```

Run the following commands to start training:

```bash
export DETECTRON2_DATASETS=/home/cvpr2023/LuuTru/dataset/VisDrone/cocoVisdrone/
python tools/train_net.py \
    --config-file projects/dab_detr/configs/dab_detr_r50_visdrone_1ep.py
```

After training finishes, use the checkpoint file model_final.pth saved in "output/VisDrone_dab_detr_r50_1ep/" (the train.output_dir set in the config) to run the evaluation.

![](https://hackmd.io/_uploads/rk_h7Hjh2.png)

```bash
python tools/train_net.py --config-file "projects/dab_detr/configs/dab_detr_r50_visdrone_1ep.py" \
    --eval-only \
    train.init_checkpoint="output/VisDrone_dab_detr_r50_1ep/model_final.pth"
```

Note: when running inference with a model trained on a new dataset, also pass the --metadata_dataset argument so that the bounding boxes are visualized with the correct labels.

```bash
python demo/demo.py --config-file "./projects/dab_detr/configs/dab_detr_r50_visdrone_1ep.py" \
    --input "/home/cvpr2023/LuuTru/dataset/VisDrone/cocoVisdrone/train/0000002_00005_d_0000014.jpg" \
    --output "visualized_results.jpg" \
    --metadata_dataset "VisDrone_train" \
    --opts train.init_checkpoint="./output/VisDrone_dab_detr_r50_1ep/model_final.pth"
```
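
For the annotation conversion described in Step 4.2.1, the following is a minimal sketch of what such a converter might look like, not the script used for this guide. It assumes the commonly documented VisDrone2019-DET text layout, with one object per line as `bbox_left,bbox_top,bbox_width,bbox_height,score,object_category,truncation,occlusion`, and it keeps category ids 0 to 11 in the same order as the thing_classes list in the config. The function name `visdrone_to_coco`, the sample file name, and the sample numbers are hypothetical; check the layout against your copy of the dataset before relying on it.

```python
import json

# Category names in VisDrone id order (0..11), same order as thing_classes in the config.
VISDRONE_CLASSES = [
    "ignored regions", "pedestrian", "people", "bicycle", "car", "van",
    "truck", "tricycle", "awning-tricycle", "bus", "motor", "others",
]


def visdrone_to_coco(samples):
    """Convert VisDrone-style annotations to a COCO-format dict.

    `samples` is a list of (file_name, width, height, annotation_text) tuples.
    Each line of annotation_text is assumed to be
    x,y,w,h,score,category,truncation,occlusion.
    """
    coco = {
        "images": [],
        "annotations": [],
        "categories": [
            {"id": i, "name": name} for i, name in enumerate(VISDRONE_CLASSES)
        ],
    }
    next_ann_id = 1
    for image_id, (file_name, width, height, text) in enumerate(samples, start=1):
        coco["images"].append(
            {"id": image_id, "file_name": file_name, "width": width, "height": height}
        )
        for line in text.splitlines():
            # Drop empty fields so a trailing comma does not break parsing.
            fields = [f for f in line.strip().split(",") if f]
            if len(fields) < 6:
                continue  # skip blank or malformed lines
            x, y, w, h = (float(v) for v in fields[:4])
            if w <= 0 or h <= 0:
                continue  # skip degenerate boxes
            coco["annotations"].append({
                "id": next_ann_id,
                "image_id": image_id,
                "category_id": int(fields[5]),
                "bbox": [x, y, w, h],  # COCO boxes are [x, y, width, height]
                "area": w * h,
                "iscrowd": 0,
            })
            next_ann_id += 1
    return coco


if __name__ == "__main__":
    # Made-up sample: one 100x50 image with a single "car" box.
    demo = [("example.jpg", 100, 50, "10,20,30,40,1,4,0,0\n")]
    print(json.dumps(visdrone_to_coco(demo)["annotations"], indent=2))
```

In practice the image width and height have to be read from the image files themselves (for example with an image library), and the resulting dict would be written with `json.dump` to the annotation paths registered in the config, such as annotations/train.json.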
