---
title: (Docs) MiDaS
tags: Documentation
---

<!-- {%hackmd theme-dark %} {%hackmd sMV2zv-CTsuIqnpb0hZLmA %} -->

# [Documentation] MiDaS

The code can be downloaded here 👇
[Towards Robust Monocular Depth Estimation: Mixing Datasets for Zero-shot Cross-dataset Transfer](https://github.com/isl-org/MiDaS)

## ⚙️ Environment Setup

1. Use Anaconda to create the environment.
    ```bash
    conda env create --name [ENVNAME] --file environment.yaml
    ```
2. Download a model. I used [dpt_swin2_large_384](https://github.com/isl-org/MiDaS/releases/download/v3_1/dpt_swin2_large_384.pt).
3. Put the test images in the ```input``` folder.
4. Run inference.
    ```bash
    # Infer on input files:
    #   --model_type: name of the model (just the name, no path)
    #   --input_path: folder of input files
    #   --output_path: folder for the output
    # Produces a .pfm depth map and a .png for each input.
    python run.py \
        --model_type dpt_swin2_large_384 \
        --input_path input/ \
        --output_path output/

    # Infer in real time:
    python run.py --model_type dpt_swin2_large_384 --side
    ```
5. **[Optional]** Error handling.
    ```bash
    # No need to run these if you hit no errors.

    # ERROR 1:
    # FileNotFoundError: [Errno 2] No such file or directory: './externals/Next_ViT/classification/nextvit.py'
    ./install_next_vit.sh

    # ERROR 2:
    # ...
    pip install openvino
    ```
6. Install the ```pypfm``` package to read ```.pfm``` files.
    ```bash
    pip install pypfm
    # pypfm documentation:
    # https://pypi.org/project/pypfm/
    ```
7. Python script to read a ```.pfm``` file.
    ```python
    from pypfm import PFMLoader

    path = "output/photo_1-dpt_swin2_large_384.pfm"
    loader = PFMLoader(color=False, compress=False)
    pfm = loader.load_pfm(path)
    print(pfm)
    ```
8. Downgrade the PyTorch version to match the current CUDA version (10.2).
    ```bash
    conda install pytorch==1.12.1 torchvision==0.13.1 torchaudio==0.12.1 cudatoolkit=10.2 -c pytorch
    ```

## Score

Version | Model | FPS | Image
--- | --- | --- | ---
3.1 | dpt_swin2_large_384 | 7.9 ± 0.2 | ![](https://i.imgur.com/ncpkY67.png)
3.1 | dpt_swin2_tiny_256 | 30.0 ± 3.0 | ![](https://i.imgur.com/TvVcZ1a.png)
3.0 | dpt_hybrid_384 | 13.5 ± 1.0 | ![](https://i.imgur.com/xNud9YE.png)
3.0 | dpt_large_384 | 7.85 ± 0.1 | ![](https://i.imgur.com/M8HMcGZ.png)